@hariby
Last active February 19, 2025 11:42
DeepSeek on AWS misc

<|begin▁of▁sentence|><|User|>A man has 53 socks in his drawer: 21 identical blue, 15 identical black and 17 identical red. The lights are out, and he is completely in the dark. How many socks must he take out to make 100 percent certain he has at least one pair of black socks?<|Assistant|>

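<|begin▁of▁sentence|><|User|>There were 23 🍏 in the cafeteria. They used 20 of them to prepare lunch and then bought 6 more. How many 🍏 are in the cafeteria now?<|Assistant|>

<|begin▁of▁sentence|><|User|>Janet's 🪿 lay 16 🥚 per day. She eats 3 for breakfast every morning and uses 4 every day to bake muffins for her friends. She sells the remaining 🥚 for $2 each. How much does she earn each day?<|Assistant|>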

DeepSeek on AWS hands-on

1. Amazon Bedrock Marketplace

  • https://catalog.us-east-1.prod.workshops.aws/join?access-code=<provided on the day>
  • Leave the region at the default, N. Virginia (us-east-1)
  • Open Amazon Bedrock
  • Open Model catalog
  • Type DeepSeek to filter the list
  • Choose a model such as DeepSeek-R1-Distill-Llama-8B and deploy it
  • Select the model shown under Marketplace deployments and try it in the Playground
  • For the prompt format, see the example
    • <|begin▁of▁sentence|><|User|>Write your text here<|Assistant|>
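The prompt format above can be wrapped in a small helper so the special tokens are not retyped by hand. This is a minimal sketch: `build_prompt` and the constant names are hypothetical, and the token strings are copied from the example above rather than taken from the model's tokenizer configuration.

```python
# Special tokens copied verbatim from the prompt example above (assumption:
# they match what the deployed model expects).
BOS = "<|begin▁of▁sentence|>"
USER = "<|User|>"
ASSISTANT = "<|Assistant|>"


def build_prompt(user_text: str) -> str:
    """Wrap a single user message in the prompt format shown above."""
    return f"{BOS}{USER}{user_text.strip()}{ASSISTANT}"


if __name__ == "__main__":
    # Paste the printed string into the Playground prompt box.
    print(build_prompt("Write your text here"))
```

The helper only handles a single user turn, which is all the examples in these notes use.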

2. SageMaker AI

GPU

import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client("iam")
    role = iam.get_role(RoleName="sagemaker_execution_role")["Role"]["Arn"]

# Hub Model configuration. https://huggingface.co/models
hub = {
    "HF_MODEL_ID": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    "SM_NUM_GPUS": json.dumps(1),
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="3.0.1"),
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    container_startup_health_check_timeout=300,
)

# send request
predictor.predict({
    "inputs": "Hi, what can you help me with?",
})
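The request above sends raw text. The sketch below shows one way the payload might be assembled so that it uses the prompt format from section 1 together with explicit generation parameters. `make_payload` is a hypothetical helper, the parameter keys are the ones used in the AWS Inferentia2 request in these notes, and the default values are illustrative assumptions, not recommended settings.

```python
def make_payload(user_text, max_new_tokens=512, temperature=0.6, top_p=0.95):
    """Build a request body with the prompt format from section 1.

    The default values are assumptions chosen for illustration.
    """
    prompt = f"<|begin▁of▁sentence|><|User|>{user_text}<|Assistant|>"
    return {
        "inputs": prompt,
        "parameters": {
            "do_sample": True,
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
            "top_p": top_p,
        },
    }


payload = make_payload("Hi, what can you help me with?")
print(payload["inputs"])
# With a deployed endpoint, the payload would be sent as:
#   predictor.predict(payload)
```

Building the payload in a plain function keeps it testable without a running endpoint.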

AWS Inferentia2

import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client("iam")
    role = iam.get_role(RoleName="sagemaker_execution_role")["Role"]["Arn"]

# Hub Model configuration. https://huggingface.co/models
hub = {
    "HF_MODEL_ID": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    "HF_NUM_CORES": "2",
    "HF_AUTO_CAST_TYPE": "bf16",
    "MAX_BATCH_SIZE": "8",
    "MAX_INPUT_TOKENS": "3686",
    "MAX_TOTAL_TOKENS": "4096",
}


region = boto3.Session().region_name
image_uri = f"763104351884.dkr.ecr.{region}.amazonaws.com/huggingface-pytorch-tgi-inference:2.1.2-optimum0.0.27-neuronx-py310-ubuntu22.04"

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=image_uri,
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.inf2.xlarge",
    container_startup_health_check_timeout=1800,
    volume_size=512,
)

# send request
predictor.predict(
    {
        "inputs": "What is the capital of France?",
        "parameters": {
            "do_sample": True,
            "max_new_tokens": 128,
            "temperature": 0.7,
            "top_k": 50,
            "top_p": 0.95,
        }
    }
)
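The `MAX_INPUT_TOKENS` and `MAX_TOTAL_TOKENS` values in the configuration above bound how many tokens a request can generate. The sketch below works through that arithmetic, assuming the container requires input tokens plus `max_new_tokens` to stay within `MAX_TOTAL_TOKENS`. `max_new_tokens_allowed` is a hypothetical helper, and counting the input tokens is left to the caller.

```python
# Limits copied from the hub configuration above.
MAX_INPUT_TOKENS = 3686
MAX_TOTAL_TOKENS = 4096


def max_new_tokens_allowed(input_tokens,
                           max_input=MAX_INPUT_TOKENS,
                           max_total=MAX_TOTAL_TOKENS):
    """Return the largest max_new_tokens that fits the configured limits.

    Assumes input_tokens + max_new_tokens must not exceed max_total.
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    if input_tokens > max_input:
        raise ValueError(f"prompt has {input_tokens} tokens, limit is {max_input}")
    return max_total - input_tokens


# A prompt that fills the whole input budget leaves 4096 - 3686 tokens.
print(max_new_tokens_allowed(3686))  # 410
# A short prompt leaves most of the budget for generation.
print(max_new_tokens_allowed(100))   # 3996
```

Under that assumption, the `max_new_tokens` of 128 in the request above fits for any prompt within the input limit.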

3 (optional). Bedrock Custom Model Import
