Posts

Showing posts from May, 2024

Platforms for Deploying a Flask Server

1. Heroku
   Pros: Easy to set up, free tier available, good for small to medium applications.
   Cons: Limited scaling options on the free tier; can become expensive for larger applications.
   How to Use: Deploy your Flask app to Heroku by following their documentation. You'll need to create a Procfile and use the Heroku CLI to push your app (a minimal sketch follows this list).
2. AWS (Amazon Web Services)
   Pros: Highly scalable, extensive services (e.g., EC2, Elastic Beanstalk, Lambda), pay-as-you-go.
   Cons: Can be complex to set up; potential for high costs without proper management.
   How to Use: Use Elastic Beanstalk for a simplified deployment, or set up an EC2 instance for more control.
3. Google Cloud Platform (GCP)
   Pros: Scalable, integrated services, pay-as-you-go.
   Cons: Can be complex to set up; similar cost considerations as AWS.
   How to Use: Use Google App Engine for easy deployment or Google Kubernetes Engine for containerized applications.
4. Microsoft Azure
   Pros: Scalable, extensive...
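For the Heroku option above, here is a minimal sketch of the pieces involved. It assumes the Flask application object is named app inside app.py and uses gunicorn as the production web server; the app name in the CLI commands is a placeholder, not one from this post.

Procfile (tells Heroku which command starts the web process):
web: gunicorn app:app

Typical deploy flow with the Heroku CLI:
$ heroku login
$ heroku create my-flask-app      # placeholder app name
$ git push heroku main

Make sure requirements.txt lists Flask and gunicorn so Heroku installs them during the build.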

Challenges with Local Server

Hardware and Infrastructure Challenges

Resource Requirements:
1. Memory: LLMs require significant RAM to run efficiently. For example, models like GPT-3 can require dozens of gigabytes of RAM.
2. Storage: The models themselves are large (many gigabytes), and additional storage is needed for caching, data storage, and logs.
3. Processing Power: High-performance CPUs and preferably GPUs are needed to handle the computational load. Inferencing with LLMs is resource-intensive and benefits greatly from the parallel processing capabilities of GPUs (a quick way to check your machine's resources is sketched after this list).

Scalability:
1. Floating IPs: A server needs floating IPs to handle multiple requests; our local machine does not have floating IPs for this.
2. Load Balancing: Managing multiple requests simultaneously requires efficient load balancing. Without a cloud interface, this must be handled locally.
3. Horizontal Scaling: Scaling out to multiple machines can be complex withou...
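Before committing to a local deployment, it helps to check what the machine actually has. A minimal sketch, assuming psutil and PyTorch are installed; the 16 GB threshold is only an illustrative figure, not a requirement stated in this post.

import psutil
import torch

# Available system RAM in gigabytes
ram_gb = psutil.virtual_memory().available / (1024 ** 3)
print(f"Available RAM: {ram_gb:.1f} GB")

# GPU presence and VRAM, if CUDA is available
if torch.cuda.is_available():
    vram_gb = torch.cuda.get_device_properties(0).total_memory / (1024 ** 3)
    print(f"GPU: {torch.cuda.get_device_name(0)}, VRAM: {vram_gb:.1f} GB")
else:
    print("No CUDA GPU detected; inference will fall back to CPU and be much slower.")

if ram_gb < 16:
    print("Warning: less than 16 GB of free RAM; an 8B-parameter model may not fit comfortably.")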

Tentative Plan for Deploying LLAMA 3 8b Locally and Creating a Local Server and Endpoint API

Steps Involved:

Set up the environment:
- Install Python: Ensure you have Python installed on your local machine. You can download and install it from the official Python website.
- Create a virtual environment (optional but recommended): Set up a virtual environment to isolate your project dependencies, using venv or virtualenv.

1. Install LLAMA 3 8b: First, install LLAMA 3 8b. You can follow the instructions provided by NVIDIA for installation; make sure you have all the necessary dependencies installed.
2. Set Up a Flask Server: Get familiar with the Flask framework.
   2.1 Set up a Flask server to handle requests to the LLAMA 3 8b model.
   2.2 Install Flask.
   2.3 Create a Flask app: Create a Python file (e.g., app.py) and initialize a Flask app.
3. Integrate LLAMA 3 8b: Integrate LLAMA 3 8b into the predict() function to make predictions (a sketch follows these steps). Since LLAMA 3 8b is a pretrain...
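A minimal sketch of steps 2 and 3, assuming the model weights are loaded through the Hugging Face transformers library; the checkpoint path, port, and generation settings are placeholders, not fixed choices from this plan.

# app.py
from flask import Flask, request, jsonify
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "meta-llama/Meta-Llama-3-8B-Instruct"  # placeholder: point this at your local checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, device_map="auto")  # device_map needs accelerate installed

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    # Expects a JSON body like {"prompt": "..."}
    prompt = request.get_json(force=True).get("prompt", "")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=256)
    text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return jsonify({"response": text})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)

Once the server is running, the endpoint can be exercised with, for example:
$ curl -X POST http://localhost:5000/predict -H "Content-Type: application/json" -d '{"prompt": "Hello"}'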

Medical Report Analyzer - Progress

Content: This post covers the approach we followed for the Medical Report Analyzer project, the different steps involved in the process, and the libraries used, with code snippets.

Extracting the Text from the Input PDF

Step 1: We extracted text from the PDF using the Python library pytesseract. Install the useful libraries using pip:
! sudo apt-get install tesseract-ocr
! pip install pytesseract
! pip install PyMuPDF

Step 2: Now write the code to extract text from the PDF; you will get the output text (a short sketch follows below).

There are two ways to extract the specific field details from the extracted text:
1. Using the hard-coded method, where you write code with many if-else conditions to handle different cases for different formats of medical report, and parse the text using regular expressions to extract the relevant field details like Name, Age, Gender, DOB, etc.
2. You can use LLMs to extract the req...
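A minimal sketch of steps 1-2 and the regex option, assuming the PDF pages are rendered to images with PyMuPDF and then OCR'd with pytesseract; the file name and field patterns are placeholders, not the exact ones used in the project.

import io
import re
import fitz  # PyMuPDF
import pytesseract
from PIL import Image

def extract_text_from_pdf(pdf_path):
    """Render each page to an image and run OCR on it."""
    doc = fitz.open(pdf_path)
    text = ""
    for page in doc:
        pix = page.get_pixmap(dpi=300)  # higher DPI helps OCR accuracy
        img = Image.frombytes("RGB", [pix.width, pix.height], pix.samples)
        text += pytesseract.image_to_string(img)
    return text

text = extract_text_from_pdf("report.pdf")  # placeholder file name

# Hard-coded option: pull fields out with regular expressions (patterns are illustrative)
name = re.search(r"Name\s*[:\-]\s*(.+)", text)
age = re.search(r"Age\s*[:\-]\s*(\d+)", text)
print(name.group(1) if name else "Name not found")
print(age.group(1) if age else "Age not found")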

Benchmarking Inception_V3

Analyzing the performance of any ML model is crucial. While running an ML model on a device, we need to know how the model performs on the edge device, so that we can get the performance metrics of the different layers of the ML model and other useful insights across the different runtimes.

Running Benchmarks for Inception_V3

Prerequisites:
- your_model.dlc
- A text file listing all the input data. For an example, see: $SNPE_ROOT/models/alexnet/data/image_list.txt.
- All the input data listed in the text file. For an example, see $SNPE_ROOT/models/alexnet/data/cropped.

$ cd $SNPE_ROOT/benchmark
$ cp alexnet_sample.json inception_v3.json
$ vi inception_v3.json   // modify the inception_v3.json parameters according to inception_v3 (a hedged example of the full config and run command follows below)

{
  "Name": "inceptionV3",
  "HostRootPath": "inception_v3",
  "HostResultsDir": "inception_v3/results",
  "DevicePath": "/data/local/tmp/snpeexample",
  "D...
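As a rough guide, the remaining fields usually follow the layout of alexnet_sample.json; the device serial, DLC path, input list, and runtime choices below are placeholders to adapt for your setup, not values taken from this post.

{
  "Name": "inceptionV3",
  "HostRootPath": "inception_v3",
  "HostResultsDir": "inception_v3/results",
  "DevicePath": "/data/local/tmp/snpeexample",
  "Devices": ["<device-serial>"],
  "Runs": 2,
  "Model": {
    "Name": "INCEPTION_V3",
    "Dlc": "../models/inception_v3/dlc/inception_v3.dlc",
    "InputList": "../models/inception_v3/data/target_raw_list.txt",
    "Data": ["../models/inception_v3/data/cropped"]
  },
  "Runtimes": ["CPU", "GPU", "DSP"],
  "Measurements": ["timing"]
}

The benchmark is then launched from the same directory with the snpe_bench.py script shipped in the SDK, for example:
$ python snpe_bench.py -c inception_v3.json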