Gilligan Tech

Built on the world's leading AI infrastructure.

Gilligan Tech doesn't lock you into one cloud or one model. We build on Google Vertex AI, AWS Bedrock, Azure OpenAI, and Cloudflare Workers AI — routing every task to the right model, on the right infrastructure, at the right cost.

4 Cloud Platforms
Google, AWS, Azure, and Cloudflare, all under one API.

12+ AI Models
Gemini, GPT-4o, Claude, Llama, Mistral, and more, routed by task type.

<100ms Edge Latency
Cloudflare Workers AI delivers edge inference with sub-100ms response times.

0 Training on Your Data
Your documents never train any model. Period.

Infrastructure

Four clouds. One platform.

Each cloud AI provider brings something distinct. We use all four, and select the best fit for each workload automatically.

Google Vertex AI
Long-context reasoning · Enterprise RAG · Vision

Vertex AI is our primary platform for reasoning-heavy tasks and enterprise-scale document retrieval. Gemini 1.5 Pro's 1M-token context window makes it well suited to analysing large document sets in a single pass.

  • Gemini 2.0 Flash — High-throughput inference for real-time LLM features and chat
  • Gemini 1.5 Pro — Complex document analysis with 1M-token context window
  • Vertex AI Search & RAG Engine — Managed enterprise retrieval with Grounding
  • Imagen 3 — Visual content generation for marketing assets
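
As a rough illustration of the single-pass idea, the sketch below estimates whether a document set fits inside a 1M-token window. The 4-characters-per-token ratio, the reserved output budget, and the function names are illustrative assumptions; they are not Gilligan Tech's implementation or Google's tokenizer.

```python
from typing import Iterable

CONTEXT_WINDOW_TOKENS = 1_000_000  # 1M-token window cited above
CHARS_PER_TOKEN = 4                # assumption: rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Crude token estimate; a production system would call the model's tokenizer."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_single_pass(documents: Iterable[str], reserved_for_output: int = 8_192) -> bool:
    """Return True if all documents plus an output budget fit in one context window."""
    total = sum(estimate_tokens(doc) for doc in documents)
    return total + reserved_for_output <= CONTEXT_WINDOW_TOKENS
```

A document set that fails this check would need chunking or retrieval rather than a single long-context call.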

AWS Bedrock
Open-weight models · Embeddings · Cost-optimised inference

AWS Bedrock gives us access to both proprietary and open-weight models through a single managed API. For data-sensitive customers who need model transparency, the open-weight Llama 3.1 70B on Bedrock is our privacy-first choice.

  • Claude 3.5 Sonnet (via Bedrock) — Primary reasoning model for complex multi-step tasks
  • Meta Llama 3.1 70B — Open-weight option for privacy-sensitive deployments
  • Amazon Titan Text Embeddings V2 — Vector embeddings for semantic search and RAG
  • Amazon Nova Micro — Ultra-fast, cost-efficient inference for high-volume ops

Azure OpenAI
Multimodal reasoning · Speech · Enterprise compliance

Azure OpenAI gives us GPT-4o's multimodal reasoning inside Microsoft's enterprise compliance boundary — important for customers in regulated industries. Azure AI Foundry handles fine-tuning and custom deployment.

  • GPT-4o — Multimodal reasoning + vision for document and image processing
  • GPT-4o mini — Cost-efficient for high-volume classification and extraction
  • Azure AI Foundry — Managed fine-tuning, evaluation, and custom deployment
  • Whisper (Azure) — Speech-to-text for meeting notes, calls, and audio transcription

Cloudflare Workers AI
Edge inference · Sub-100ms latency · Global reach

When response time is everything, Cloudflare Workers AI runs inference at the network edge — milliseconds from any user, globally. Ideal for real-time suggestions, autocomplete, and lightweight triage before routing to a heavier model.

  • Llama 3.1 8B Instruct — Edge inference for sub-50ms response on common queries
  • Mistral 7B Instruct — Lightweight reasoning at the network edge
  • Whisper Large v3 — Audio transcription at the edge (data stays in-region)
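
The "lightweight triage before routing to a heavier model" pattern can be sketched as below. This is a minimal illustration only: the word limit, the `[attachment]` marker, the `HEAVY_TIER` label, and the model identifier string are hypothetical and not taken from the platform.

```python
from dataclasses import dataclass

EDGE_MODEL = "llama-3.1-8b-instruct"  # edge model named above; identifier string is illustrative
HEAVY_TIER = "heavy"                  # hypothetical label for escalation to a larger model


@dataclass(frozen=True)
class TriageDecision:
    target: str
    reason: str


def triage(query: str, max_edge_words: int = 40) -> TriageDecision:
    """Keep short, simple queries at the edge; escalate the rest.

    The word limit and the attachment marker are assumptions for illustration.
    """
    words = query.split()
    if not words:
        return TriageDecision(EDGE_MODEL, "empty query")
    if len(words) > max_edge_words:
        return TriageDecision(HEAVY_TIER, "query too long for edge model")
    if "[attachment]" in query:
        return TriageDecision(HEAVY_TIER, "attachment needs a multimodal model")
    return TriageDecision(EDGE_MODEL, "short text query")
```

Cheap rules like these run in microseconds, so the triage step itself adds little to the edge latency budget.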

Architecture

Intelligent model routing

The Gilligan Tech platform analyses each incoming task and routes it to the optimal model — balancing capability, latency, and cost automatically.

01 Task Classification
Is this a quick query, a long-context analysis, a multimodal task, or a real-time edge operation?

02 Model Selection
Route to Gemini 1.5 Pro, GPT-4o, Llama 3.1 70B, or a Cloudflare edge model, whichever best fits the task profile.

03 Inference
The selected model processes the request, and your data never leaves the approved cloud boundary.

04 Result & Audit
Results are returned to your product. Every call is logged with model, latency, and token cost for full auditability.
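
The four steps above can be sketched as a small pipeline. Everything here is an assumption made for illustration: the rule-based classifier, the mapping from task type to model, the model identifier strings, and the word count standing in for token cost. The real routing logic is not described in this page.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class TaskType(Enum):
    QUICK_QUERY = "quick_query"
    LONG_CONTEXT = "long_context"
    MULTIMODAL = "multimodal"
    REALTIME_EDGE = "realtime_edge"


# Hypothetical routing table; the pairing of task types to models is assumed.
ROUTES: Dict[TaskType, str] = {
    TaskType.QUICK_QUERY: "llama-3.1-70b",
    TaskType.LONG_CONTEXT: "gemini-1.5-pro",
    TaskType.MULTIMODAL: "gpt-4o",
    TaskType.REALTIME_EDGE: "cloudflare-edge",
}


@dataclass
class Task:
    prompt: str
    has_image: bool = False
    realtime: bool = False


@dataclass
class AuditRecord:
    model: str
    latency_ms: float
    tokens: int


def classify(task: Task) -> TaskType:
    """Step 01: rule-based stand-in for the real classifier."""
    if task.realtime:
        return TaskType.REALTIME_EDGE
    if task.has_image:
        return TaskType.MULTIMODAL
    if len(task.prompt) > 20_000:
        return TaskType.LONG_CONTEXT
    return TaskType.QUICK_QUERY


def route(task: Task, infer: Callable[[str, str], str], audit_log: List[AuditRecord]) -> str:
    """Steps 02 to 04: select a model, run inference, and record an audit entry."""
    model = ROUTES[classify(task)]                            # step 02
    start = time.perf_counter()
    result = infer(model, task.prompt)                        # step 03
    latency_ms = (time.perf_counter() - start) * 1000
    tokens = len(task.prompt.split()) + len(result.split())   # crude word count, not real tokens
    audit_log.append(AuditRecord(model, latency_ms, tokens))  # step 04
    return result
```

Passing `infer` as a callable keeps the sketch independent of any particular cloud SDK; a real deployment would plug in one client per provider.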

Security & Compliance

Enterprise-grade by default.

🔒

No model training on your data

Customer documents and queries are never used to train or fine-tune any model. We use inference-only API calls on all four platforms.

🌍

US data residency

All data is processed and stored within US-based cloud regions by default. Regional data residency for EU customers is available on Enterprise plans.

📄

Full audit trail

Every model call is logged with the model used, token count, latency, and response. Available for download in your dashboard at any time.
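
One possible shape for such a log entry is sketched below, using the four fields named above. The class name, field names, and the JSON Lines export format are assumptions for illustration; the actual dashboard export format is not specified here.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable


@dataclass(frozen=True)
class ModelCallLog:
    model: str
    token_count: int
    latency_ms: float
    response: str


def to_jsonl(entries: Iterable[ModelCallLog]) -> str:
    """Serialise entries as JSON Lines, one model call per line."""
    return "\n".join(json.dumps(asdict(e), sort_keys=True) for e in entries)
```

JSON Lines is a convenient choice for downloads because each call can be appended and parsed independently.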

Extended Model Support

Beyond the big four clouds.

The Gilligan Tech routing layer is model-agnostic. In addition to our four primary cloud platforms, we support these providers for specialised workloads, open-weight deployments, and regional data residency requirements.

Provider | Models | Best for
NVIDIA NIM | Llama 3.1 Nemotron · Mistral NIM | Optimised GPU inference via NIM microservices; NVIDIA Inception ecosystem
IBM watsonx.ai | Granite 13B · Llama 3.1 on watsonx | Enterprise governance, IBM-managed infrastructure, TechXChange partner credits
Cohere | Command R+ · Embed v3 | RAG-optimised embeddings, reranking, enterprise retrieval pipelines
Mistral AI | Mistral Large 2 · Mistral 7B | EU-hosted inference, GDPR data residency, cost-efficient reasoning
Together.ai | Llama 3.1 405B · Qwen 2.5 72B | High-throughput open-weight inference at scale; batch workloads
Hugging Face Inference | Any Inference Endpoint model | Custom fine-tuned models, open-source ecosystem, private deployments
Anthropic (direct) | Claude 3.5 Sonnet · Claude 3 Haiku | Direct API path (non-Bedrock) for customers requiring Anthropic MSA terms
Perplexity Sonar | sonar-pro · sonar-reasoning | Web-grounded RAG with live search results; real-time fact retrieval
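
A model-agnostic routing layer is often built around a provider registry, where each provider is wrapped in an adapter with a common call signature. The sketch below shows that general pattern under stated assumptions: the `Adapter` type, `register`, and `complete` are hypothetical names, and no real provider SDK is called.

```python
from typing import Callable, Dict

# A provider adapter is any callable taking (model, prompt) and returning text.
Adapter = Callable[[str, str], str]

_REGISTRY: Dict[str, Adapter] = {}


def register(provider: str, adapter: Adapter) -> None:
    """Add or replace the adapter for a provider (names are case-insensitive)."""
    _REGISTRY[provider.lower()] = adapter


def complete(provider: str, model: str, prompt: str) -> str:
    """Dispatch a prompt to the named provider's adapter."""
    try:
        adapter = _REGISTRY[provider.lower()]
    except KeyError:
        raise ValueError(f"unknown provider: {provider}") from None
    return adapter(model, prompt)
```

With this shape, supporting another provider means registering one more adapter; callers of `complete` do not change.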

See the platform in action.

We'll walk you through a live demo tailored to your use case — document Q&A, customer support, workflow automation, or all three.