Medical Report Analyzer - Progress


Content:
  • The approach we followed for the Medical Report Analyzer project.
  • The steps involved in the process and the libraries used, with code snippets.


Extracting the Text From the Input PDF

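Step 1:

We extract the text from the PDF using the Python library pytesseract, a wrapper around the Tesseract OCR engine. PyMuPDF is installed as well for reading the PDF.

Install the required libraries (the leading ! runs a shell command from a notebook cell):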

  • !sudo apt-get install tesseract-ocr
    !pip install pytesseract

  • !pip install PyMuPDF

Step 2:

Next, the code extracts the text from the PDF.
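A minimal sketch of this step, assuming each page is rendered to an image with PyMuPDF and then passed to pytesseract for OCR. The function names, the 300 dpi default and the whitespace clean-up are illustrative assumptions, not necessarily what the project uses.

```python
import io
import re


def normalize_whitespace(text: str) -> str:
    """Collapse runs of spaces/tabs and drop blank lines left behind by OCR."""
    lines = (re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)


def extract_text_from_pdf(pdf_path: str, dpi: int = 300) -> str:
    """Render each PDF page to an image and run Tesseract OCR on it."""
    # Third-party imports are kept local so normalize_whitespace stays usable
    # even when the OCR stack is not installed.
    import fitz  # PyMuPDF
    import pytesseract
    from PIL import Image

    page_texts = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            pix = page.get_pixmap(dpi=dpi)  # rasterize the page (assumed resolution)
            image = Image.open(io.BytesIO(pix.tobytes("png")))
            page_texts.append(pytesseract.image_to_string(image))
    return normalize_whitespace("\n".join(page_texts))
```

Calling extract_text_from_pdf("report.pdf"), where report.pdf is a placeholder path, would return the OCR text of all pages as one string.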



Running it produces the extracted text of the report.


There are now two ways to extract specific field details from the extracted text:

  1. Hard-coded method: write code with many if-else conditions to handle the different formats of medical reports, parsing the text with regular expressions to extract the relevant fields such as Name, Age, Gender, DOB, etc.
  2. Use an LLM to extract the required field details.
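To make the first option concrete, here is a rough sketch of regex-based field extraction. The patterns and the label spellings they expect ("Name:", "Age:", "Gender:" or "Sex:", "DOB:" or "Date of Birth:") are assumptions for illustration; every real report layout will likely need its own variants, which is exactly where the if-else cases pile up.

```python
import re

# Hypothetical patterns: one per field, tolerant of ":" or "-" separators.
FIELD_PATTERNS = {
    "Name": re.compile(r"\bName\s*[:\-][ \t]*([A-Za-z .]+?)[ \t]*$",
                       re.IGNORECASE | re.MULTILINE),
    "Age": re.compile(r"\bAge\s*[:\-]\s*(\d{1,3})", re.IGNORECASE),
    "Gender": re.compile(r"\b(?:Gender|Sex)\s*[:\-]\s*(Male|Female|Other|M|F)\b",
                         re.IGNORECASE),
    "DOB": re.compile(
        r"\b(?:DOB|Date of Birth)\s*[:\-]\s*(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})",
        re.IGNORECASE),
}


def extract_fields(text):
    """Return a dict of field name -> matched value, or None if not found."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1).strip() if match else None
    return fields


# Made-up sample text, only to show the call.
sample = "Patient Name: John Doe\nAge: 45\nGender: Male\nDOB: 12/03/1979\n"
print(extract_fields(sample))
```

Fields that are not found come back as None, so the caller can tell which report formats still need a new pattern.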


We will look at the first approach first. It does extract the required field details, but handling the many variations in input medical reports gets very complicated. The code for this approach extracts the details and saves them in an Excel sheet.
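The saving step could look roughly like the sketch below, assuming the extracted details arrive as one dictionary per report and that pandas (with openpyxl for .xlsx output) is available. The column list, function names and output filename are hypothetical.

```python
COLUMNS = ["Name", "Age", "Gender", "DOB"]


def records_to_rows(records, columns=COLUMNS):
    """Turn a list of field dicts into rows with a fixed column order."""
    # Missing or None values become empty cells.
    return [[record.get(col) or "" for col in columns] for record in records]


def save_to_excel(records, path="extracted_details.xlsx", columns=COLUMNS):
    """Write one row per report to an Excel sheet and return the file path."""
    # Imported here so records_to_rows works without pandas installed;
    # pandas needs the openpyxl package to write .xlsx files.
    import pandas as pd

    frame = pd.DataFrame(records_to_rows(records, columns), columns=columns)
    frame.to_excel(path, index=False)
    return path
```

For example, save_to_excel([{"Name": "John Doe", "Age": "45"}]) would write a single row with empty Gender and DOB cells.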









You can see the result in the Excel sheet.

The second approach is better at handling the many variations in medical reports.
We are trying to use Llama 2/3, an open-source LLM, for this, but due to a memory issue the runtime keeps crashing in Colab.
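Independent of which model ends up running, the LLM route needs a prompt that asks for the fields and some code that reads the reply. The sketch below shows only those two pieces and does not call any model; the prompt wording, the field list and the function names are assumptions for illustration.

```python
import json

FIELDS = ["Name", "Age", "Gender", "DOB"]


def build_extraction_prompt(report_text, fields=FIELDS):
    """Build a prompt asking the model to return the fields as JSON."""
    field_list = ", ".join(fields)
    return (
        "Extract the following fields from the medical report below: "
        f"{field_list}.\n"
        "Reply with a single JSON object using exactly these keys, "
        "and null for any field that is missing.\n\n"
        f"Report:\n{report_text}"
    )


def parse_llm_reply(reply, fields=FIELDS):
    """Parse the text between the first '{' and the last '}' as JSON."""
    empty = {field: None for field in fields}
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        return empty
    try:
        data = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return empty  # the model did not return valid JSON
    return {field: data.get(field) for field in fields}
```

Because the reply is reduced to the same field dictionary as in the regex approach, the rest of the pipeline (for example saving to Excel) would not need to change.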


