
Deploying Models with Inference API

Last updated July 1, 2024

Introduction: The Hugging Face Inference API lets you deploy models quickly and efficiently, and query them for real-time predictions via simple HTTP calls.

Steps:

  1. Setting Up Your Model on the Hugging Face Hub
  • Upload Your Model: Ensure your model is uploaded to the Hugging Face Hub. The Hugging Face documentation covers the upload process in detail; a minimal sketch is shown below.
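  • Example Upload: if your model was trained with the transformers library, you can push it to the Hub directly from Python (a minimal sketch; ./my-model-dir and my-username/my-model are placeholders, and it assumes you have already authenticated with huggingface-cli login):

        from transformers import AutoModelForSequenceClassification, AutoTokenizer

        # Load the trained model and tokenizer from a local directory (placeholder path)
        model = AutoModelForSequenceClassification.from_pretrained("./my-model-dir")
        tokenizer = AutoTokenizer.from_pretrained("./my-model-dir")

        # Upload both to the Hub under your account (placeholder repo id)
        model.push_to_hub("my-username/my-model")
        tokenizer.push_to_hub("my-username/my-model")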
  2. Accessing the Inference API
  • API Endpoint: Obtain your API endpoint from the Hugging Face model page. It usually looks like https://api-inference.huggingface.co/models/{username}/{model_name}.
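  • Concrete Example: for a public sentiment-analysis model, the endpoint looks like this (the model id below is a real public model, used here purely for illustration):

        # Endpoint for a public sentiment-analysis model on the Hub
        API_URL = "https://api-inference.huggingface.co/models/distilbert-base-uncased-finetuned-sst-2-english"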
  3. Authenticating Your API Requests
  • API Token: Generate an API token from your Hugging Face account settings (Settings → Access Tokens).
  • Include Token in Requests: attach the token as a Bearer header on every call:

        import requests

        # Replace {username}/{model_name} with your model's repo id, and
        # your_api_token with the token generated above
        API_URL = "https://api-inference.huggingface.co/models/{username}/{model_name}"
        headers = {"Authorization": f"Bearer {your_api_token}"}
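  • Avoid Hardcoding Secrets: one common pattern is to read the token from an environment variable (a sketch; HF_API_TOKEN is an arbitrary variable name chosen for this example, not one required by Hugging Face):

        import os

        # Read the token from the environment rather than embedding it in source
        your_api_token = os.environ["HF_API_TOKEN"]
        headers = {"Authorization": f"Bearer {your_api_token}"}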
  4. Making Predictions
  • Send Data for Inference: POST your inputs as JSON and read back the JSON result:

        def query(payload):
            response = requests.post(API_URL, headers=headers, json=payload)
            return response.json()

        data = {"inputs": "Your input text here"}
        result = query(data)
        print(result)
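  • Cold Starts: the first request to a model that is not yet loaded may return HTTP 503, typically with an estimated_time field in the JSON body. A retry sketch building on the query example above (max_retries and the 10-second fallback are arbitrary choices for this example):

        import time

        def query_with_retry(payload, max_retries=5):
            # Retry while the model is loading; fail fast on other errors
            for _ in range(max_retries):
                response = requests.post(API_URL, headers=headers, json=payload)
                if response.status_code == 503:
                    # The body usually suggests how long to wait before retrying
                    wait = response.json().get("estimated_time", 10)
                    time.sleep(wait)
                    continue
                response.raise_for_status()
                return response.json()
            raise RuntimeError("Model did not become ready in time")

        result = query_with_retry({"inputs": "Your input text here"})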
  5. Handling API Responses
  • Interpreting Results: Process the response to extract the predicted outputs for your application.
  • Example Response Handling (assuming a classification-style response):

        print(f"Prediction: {result[0]['label']} with score {result[0]['score']}")
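  • Response Shapes: the exact structure varies by task, and some classification pipelines return predictions nested one level deeper (a list of lists). A defensive sketch that handles both shapes:

        # Unwrap one nesting level if the task returns [[{...}, ...]]
        predictions = result[0] if isinstance(result[0], list) else result
        best = max(predictions, key=lambda p: p["score"])
        print(f"Prediction: {best['label']} with score {best['score']:.4f}")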
  6. Monitoring and Scaling
  • Monitor Usage: Use the Hugging Face dashboard to monitor your API usage and performance.
  • Scaling Options: Consider upgrading your plan or optimizing your model to handle increased load efficiently.
