TII Falcon-H1 models now available on Amazon Bedrock Marketplace and SageMaker JumpStart
Source: AWS Machine Learning Blog, https://aws.amazon.com/blogs/machine-learning/tii-falcon-h1-models-now-available-on-amazon-bedrock-marketplace-and-amazon-sagemaker-jumpstart/

TL;DR

  • The Technology Innovation Institute (TII) Falcon-H1 models are now available on Amazon Bedrock Marketplace and Amazon SageMaker JumpStart. Six instruction-tuned models are offered: 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B.
  • Falcon-H1 uses a parallel hybrid design that combines State Space Models (SSMs) such as Mamba with Transformer attention to deliver faster inference and lower memory usage while maintaining strong context understanding.
  • The Falcon-H1 family provides native multilingual support across 18 languages and supports up to 256K context length, available under the Falcon LLM license.
  • Deployment options include Amazon Bedrock Marketplace and SageMaker JumpStart, with guided steps, testing playgrounds, and integration with Bedrock APIs and SageMaker tools for secure deployment and fine tuning.

Context and background

The Falcon-H1 family originates from the Technology Innovation Institute (TII), a leading research institution based in Abu Dhabi and part of the UAE’s Advanced Technology Research Council (ATRC). TII works on AI, quantum computing, autonomous robotics, cryptography, and related fields. AWS and TII are collaborating to broaden global access to UAE-made AI models, enabling practitioners to build and scale generative AI applications with Falcon-H1.

The Falcon-H1 architecture implements a parallel hybrid design that pairs the fast inference and low memory footprint of State Space Models with the context-understanding strength of Transformer attention, drawing on concepts from both Mamba and Transformer architectures to deliver efficiency and generalization across a broad set of tasks. The family ranges from 0.5 to 34 billion parameters and natively supports 18 languages; in official guidance, the smaller variants achieve performance parity with larger models in many scenarios.

TII releases Falcon-H1 under the Falcon LLM license, a choice that reflects its mission to democratize access to high-quality AI models and to accelerate innovation across industries. Availability on Bedrock Marketplace and SageMaker JumpStart lets developers compare proprietary and public models side by side and deploy on AWS infrastructure built for security, scale, and cost effectiveness; the two services offer complementary pathways for discovery, deployment, and customization.
From a platform perspective, Bedrock Marketplace gives access to hundreds of models through unified Bedrock APIs, with options to select instance types and scale deployments to workload demands. SageMaker JumpStart provides ready-to-use architectures and workflows you can deploy through SageMaker Studio, the SageMaker SDK, or the console, enabling rapid testing and fine-tuning in secure environments that can be isolated within a VPC. Together, the two services streamline the path from model discovery to production deployment across the AWS Regions where both are available.

The walkthrough shows how to try Falcon-H1-0.5B-Instruct in the Bedrock Marketplace playground and how to invoke it through the Bedrock Converse API using an endpoint ARN that begins with arn:aws:sagemaker. It also demonstrates deploying the same model through SageMaker JumpStart with the Python SDK and Studio, illustrating how the deployment fits into existing MLOps pipelines. The guidance covers quota considerations and cost management as well, including how to request increases for ml.g6.xlarge endpoint usage and how to monitor a deployment until it reaches InService status; these steps are representative for the other Falcon-H1 models in the family.

For organizations weighing Bedrock Marketplace against SageMaker JumpStart, the original post discusses choosing between the two across use cases, workloads, and security contexts. The TII and AWS partnership aims to bring UAE-origin AI capabilities to researchers and businesses worldwide while maintaining robust security, governance, and cost controls; the AWS documentation and resources linked in the post cover Bedrock and JumpStart in more depth.
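
As a sketch of the Converse API flow described above: the snippet below shapes a single-turn prompt into the Converse message format and sends it to a Bedrock Marketplace deployment addressed by its SageMaker endpoint ARN. The ARN, region, and inference parameters are placeholders, not values from the post.

```python
# Sketch: invoking a Bedrock Marketplace deployment of Falcon-H1-0.5B-Instruct
# via the Bedrock Converse API (boto3). The endpoint ARN is a placeholder; use
# the arn:aws:sagemaker ARN shown for your own managed deployment.

def build_converse_messages(prompt: str) -> list:
    """Shape a single-turn user prompt into the Converse API message format."""
    return [{"role": "user", "content": [{"text": prompt}]}]

def converse_with_falcon(endpoint_arn: str, prompt: str, region: str = "us-east-1") -> str:
    """Send one prompt to the deployed model and return the generated text."""
    import boto3  # imported here so the pure helper above works without AWS installed

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.converse(
        modelId=endpoint_arn,  # Marketplace deployments are addressed by endpoint ARN
        messages=build_converse_messages(prompt),
        inferenceConfig={"maxTokens": 512, "temperature": 0.7},
    )
    return response["output"]["message"]["content"][0]["text"]

# Example (requires AWS credentials and a deployed endpoint):
# text = converse_with_falcon(
#     "arn:aws:sagemaker:us-east-1:111122223333:endpoint/falcon-h1-0-5b-instruct",
#     "Summarize the Falcon-H1 architecture in one sentence.",
# )
```

The message-building helper is kept separate from the network call so the request shape can be inspected and tested offline.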
The Falcon-H1 family emphasizes multilingual support across 18 languages and scales from 0.5B to 34B parameters, with a context length of up to 256K. This combination aims to provide efficient yet capable generative AI models suitable for a range of applications. The models are designed to work within AWS Cloud infrastructure, leveraging Bedrock and SageMaker features to support secure deployments, private networking, and cost-aware usage. You can explore these models in AWS regions where both Bedrock and SageMaker JumpStart services and the required instance types are available, aligning with regional requirements and compliance needs. For more background on the broader goals of this collaboration, you can consult the AWS Machine Learning Blog and related resources linked in the References section below.

What’s new

  • Six instruction-tuned Falcon-H1 models are now available on Amazon Bedrock Marketplace and SageMaker JumpStart: 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B.
  • The Falcon-H1 family uses a parallel hybrid architecture that fuses SSMs with Transformer attention to optimize inference speed and memory usage while preserving strong understanding of context.
  • Native multilingual support spans 18 languages, with context lengths up to 256K, across the model sizes.
  • Models are released under the Falcon LLM license and accessible via Bedrock Marketplace APIs or SageMaker JumpStart deployment workflows, enabling discovery, testing, and scalable production use.
  • Practical deployment guidance is provided for Bedrock Marketplace and JumpStart, including a Bedrock playground for testing and examples using the Converse API, as well as a SageMaker Python SDK workflow for JumpStart deployments.
  • Prerequisites for Bedrock deployment include having an AWS account with sufficient quota for ml.g6.xlarge endpoints; quota increases can be requested through the AWS Service Quotas console.
  • The article demonstrates end-to-end deployment steps for Falcon-H1-0.5B-Instruct as an example, with steps applicable to other Falcon-H1 models in the series.
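
The quota prerequisite above can also be handled programmatically. The sketch below uses boto3's Service Quotas API; the quota code is a placeholder (look up the code for ml.g6.xlarge endpoint usage via the Service Quotas console or `list_service_quotas` before submitting a real request).

```python
# Sketch: requesting a quota increase for ml.g6.xlarge endpoint usage through
# the AWS Service Quotas API. QUOTA_CODE is a placeholder, not a real code.

SERVICE_CODE = "sagemaker"
QUOTA_CODE = "L-XXXXXXXX"  # placeholder; find the real code via list_service_quotas

def build_quota_request(desired_value: float) -> dict:
    """Assemble the payload for request_service_quota_increase."""
    return {
        "ServiceCode": SERVICE_CODE,
        "QuotaCode": QUOTA_CODE,
        "DesiredValue": desired_value,
    }

def request_endpoint_quota(desired_value: float = 1.0, region: str = "us-east-1") -> dict:
    """Submit the increase request (requires AWS credentials and permissions)."""
    import boto3  # local import keeps the payload builder testable offline

    client = boto3.client("service-quotas", region_name=region)
    return client.request_service_quota_increase(**build_quota_request(desired_value))
```

Quota increases are approved asynchronously, so plan the request ahead of your first deployment.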

Model overview table

| Model | Parameters | Context length | Languages | Architecture |
|---|---|---|---|---|
| 0.5B | 0.5B | 256K | 18 | Hybrid SSM (Mamba) + Transformer attention |
| 1.5B | 1.5B | 256K | 18 | Hybrid SSM + Transformer attention |
| 1.5B-Deep | 1.5B | 256K | 18 | Hybrid SSM + Transformer attention |
| 3B | 3B | 256K | 18 | Hybrid SSM + Transformer attention |
| 7B | 7B | 256K | 18 | Hybrid SSM + Transformer attention |
| 34B | 34B | 256K | 18 | Hybrid SSM + Transformer attention |

Why it matters (impact for developers/enterprises)

  • Accessibility and scalability: The combination of Bedrock Marketplace and JumpStart offers a unified path to discover, compare, and deploy large language models alongside other AI assets. This helps developers evaluate the Falcon-H1 family against other available models and choose the best fit for their workloads.
  • Global reach with UAE-origin AI: The collaboration brings UAE-made AI capabilities to an international audience, aligning with regional innovation goals and the UAE National AI Strategy 2031 by enabling global access to Falcon-H1 models.
  • Efficiency and cost effectiveness: The hybrid architecture delivers efficient inference with lower memory footprints for smaller models, while maintaining strong performance for larger configurations, enabling cost-conscious deployment at scale.
  • Security and governance: Bedrock and JumpStart deployments can be configured to run within a VPC, with options for encryption keys and resource tagging to support organizational security and governance policies.
  • Multilingual capabilities: With 18 languages supported natively, Falcon-H1 models are suited to multilingual applications and global customer support scenarios.

Technical details or Implementation

The Falcon-H1 family extends from 0.5B to 34B parameters and is designed around a parallel hybrid architecture that blends SSMs with traditional attention mechanisms. SSMs such as Mamba enable faster inference and reduced memory usage, while Transformer-style attention supports context understanding and generalization. The combination offers efficiency advantages across model sizes while preserving capabilities expected from modern LLMs. The models support up to 256K context length and are designed to operate with 18 languages, enabling broad applicability across regions and use cases. Deployment paths include two primary AWS routes:

  • Amazon Bedrock Marketplace: Provides a centralized catalog of models with unified Bedrock APIs, allowing you to configure deploy settings, instance types, and security options such as VPC and encryption keys. Monitoring and management of deployment progress occur in the Managed deployments section during the provisioning phase. After deployment, you can test Falcon-H1-0.5B-Instruct directly in the Bedrock Playground and invoke the model via the Bedrock Converse API, replacing placeholders with the endpoint ARN that begins with arn:aws:sagemaker.
  • SageMaker JumpStart: Offers a JumpStart experience through SageMaker Studio, the SageMaker SDK, and the AWS Management Console. The example walkthrough demonstrates deploying Falcon-H1-0.5B-Instruct via the SageMaker Python SDK, producing a SageMaker endpoint that you can use to run inference and integrate into your applications. The JumpStart path emphasizes end-to-end workflows from deployment through inference and integration within your existing ML pipelines.

Prerequisites and operational considerations:
  • Bedrock deployment requires an AWS account with sufficient quota for ml.g6.xlarge endpoints. The default quota is often zero for endpoint usage, so a quota increase must be requested and approved before deployment.
  • After experimenting with Falcon-H1 models, it is important to delete deployed endpoints and associated resources to avoid ongoing charges. The official guidance references the SageMaker documentation for ongoing resource management.
  • While the article uses Falcon-H1-0.5B-Instruct as the example, the same deployment steps apply to other Falcon-H1 models in the family, including larger configurations where appropriate quotas and resource availability permit.
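
The JumpStart path and cleanup guidance above can be sketched with the SageMaker Python SDK. The model ID below is a placeholder (copy the exact JumpStart model ID shown in SageMaker Studio for Falcon-H1-0.5B-Instruct), and the request-body format is an assumption to verify against the model card.

```python
# Sketch: deploy a Falcon-H1 model through SageMaker JumpStart, run one
# inference, then delete the endpoint to avoid ongoing charges. MODEL_ID is a
# placeholder, not the real JumpStart identifier.

MODEL_ID = "..."  # placeholder: JumpStart model ID for Falcon-H1-0.5B-Instruct
INSTANCE_TYPE = "ml.g6.xlarge"  # the instance type the post's quota guidance covers

def build_payload(prompt: str, max_new_tokens: int = 256) -> dict:
    """Assemble a text-generation request body (format assumed; check the model card)."""
    return {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}

def deploy_query_cleanup(prompt: str):
    """Deploy via JumpStart, run one inference, then clean up the resources."""
    from sagemaker.jumpstart.model import JumpStartModel  # needs the sagemaker SDK + AWS creds

    predictor = JumpStartModel(model_id=MODEL_ID).deploy(instance_type=INSTANCE_TYPE)
    try:
        return predictor.predict(build_payload(prompt))
    finally:
        # Delete the endpoint and model so no charges accrue after the test.
        predictor.delete_model()
        predictor.delete_endpoint()
```

In production you would keep the endpoint alive between requests; the deploy-and-delete pattern here only mirrors the post's advice for short-lived experimentation.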

Key takeaways

  • Falcon-H1 models bring a UAE-origin LLM family to Bedrock and JumpStart, expanding access to advanced AI capabilities.
  • The hybrid SSM-Transformer design targets efficiency without sacrificing performance, with multilingual support across 18 languages and up to 256K context length.
  • Deployments can be conducted through Bedrock Playground or JumpStart with Studio and the Python SDK, depending on preferences for discovery, testing, or production workflows.
  • Licensing under the Falcon LLM license reinforces a collaborative and accessible approach to AI development while preserving governance and security controls.
  • Prerequisites and cost controls are important considerations; plan quota requests and resource cleanup as part of your deployment lifecycle.

FAQ

  • Which Falcon-H1 models are now available on Bedrock Marketplace and JumpStart?

    Six models: 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B.

  • How do I deploy Falcon-H1 on Bedrock versus JumpStart?

    Bedrock offers model discovery and unified APIs with configurable deployment settings and a Bedrock Playground for testing, while JumpStart provides SageMaker Studio, the SageMaker SDK, and Console-based deployment workflows for end-to-end ML pipelines. The post demonstrates deploying Falcon-H1-0.5B-Instruct in both paths, with steps applicable to other models.

  • What prerequisites are needed for Bedrock deployment?

    An AWS account and a sufficient quota for ml.g6.xlarge endpoints; you must request a quota increase via AWS Service Quotas if the default is insufficient.

  • What capabilities do Falcon-H1 models offer?

    Multilingual support across 18 languages, up to 256K context length, and a range of model sizes from 0.5B to 34B parameters, built on a hybrid SSM and Transformer architecture and released under the Falcon LLM license.

  • How should I manage resources to avoid ongoing charges?

    Delete endpoints and related resources after experimentation, following SageMaker guidance to optimize costs and avoid unnecessary usage.

References

  • TII Falcon-H1 models now available on Amazon Bedrock Marketplace and Amazon SageMaker JumpStart, AWS Machine Learning Blog: https://aws.amazon.com/blogs/machine-learning/tii-falcon-h1-models-now-available-on-amazon-bedrock-marketplace-and-amazon-sagemaker-jumpstart/