We built a web app that turns unstructured medical documents (PDFs/images) into actionable insights. Users upload a file, we extract text (pdf-parse/tesseract.js), analyze it with a structured LLaMA prompt, store files and metadata in Supabase (Storage + Postgres), and present summaries, categories, abnormalities, and references in a clean UI. There’s also a Supabase Edge Function for server-side/batch processing.

We Built an AI Medical Analyst in a Weekend at the Caltech Longevity Hackathon

2025/10/15 05:53


Why longevity?

  • Longevity care is longitudinal and data-heavy. Patients accumulate lab panels, imaging, and clinical notes over years.
  • Clinicians and individuals need fast, explainable triage: What’s abnormal? What changed? What should I read next?
  • A generalizable pipeline that works “in a weekend” helps teams experiment faster and validate real-world impact.

What we shipped at the hackathon

  • A Next.js/React UI with a dead-simple uploader and a results table.
  • Client-side text extraction for PDFs and images to support mixed inputs quickly.
  • A structured LLaMA prompt that returns summary, keywords, categories, abnormal flags, suggested filename, and PubMed titles.
  • Supabase Storage for raw files; a Postgres documents table for structured metadata.
  • A Supabase Edge Function to process stored PDFs server-side (useful for background jobs and batch workflows).

Architecture

  • Upload → Extract text → LLM analysis → Persist → Render
  • Two processing paths:
      • Client-led: immediate feedback, great for demos and small files.
      • Server-led (Edge Function): scalable, secure, and good for background/batch processing.

Key moving parts:

  • Frontend: Next.js + Tailwind
  • OCR/Parsing: pdf-parse, tesseract.js
  • AI: LLaMA chat completions API with a rigid, parse-friendly prompt
  • Backend: Supabase (Storage for blobs, Postgres for metadata)
  • Serverless: Supabase Edge Function for server-side PDF processing

Product walk-through

Home page

```tsx
        <h1 className="text-3xl font-bold leading-tight text-gray-900">
          Medical Document Analysis
        </h1>
      </div>
    </header>
    <main>
      <div className="max-w-7xl mx-auto sm:px-6 lg:px-8">
        <MedicalDocUploader />
```
  • Clean landing with a single CTA: upload a document.

Upload → extract → analyze → persist (client path)

  • Extracts text conditionally based on file type.
  • Calls LLaMA with a structured prompt to ensure predictable parsing.
  • Uploads original file to Supabase Storage and inserts metadata into documents.
```tsx
const handleFileUpload = useCallback(async (event: React.ChangeEvent<HTMLInputElement>) => {
  try {
    setIsUploading(true);
    setError(null);

    const file = event.target.files?.[0];
    if (!file) return;

    // Extract text based on file type
    const text = file.type === 'application/pdf'
      ? await extractTextFromPDF(file)
      : await extractTextFromImage(file);

    // Analyze the text with LLaMA
    const analysis = await analyzeWithLlama(text, file.name);

    // Upload the original file to Supabase Storage
    const { data: uploadData, error: uploadError } = await supabase.storage
      .from('medical-documents')
      .upload(analysis.renamed_file, file);

    if (uploadError) throw uploadError;

    // Store metadata in Supabase
    const { data: metaData, error: metaError } = await supabase
      .from('documents')
      .insert({
        filename: file.name,
        renamed_file: analysis.renamed_file,
        file_url: uploadData.path,
        summary: analysis.summary,
        keywords: analysis.keywords,
        categories: analysis.categories,
        word_count: countWords(text),
        report_type: detectReportType(text),
        threshold_flags: analysis.threshold_flags,
        pubmed_refs: analysis.pubmed_refs,
        ai_notes: analysis.ai_notes,
        status: 'processed',
        version: 1
      })
      .select()
      .single();

    if (metaError) throw metaError;

    setDocuments(prev => [...prev, metaData]);
  } catch (err) {
    setError(err instanceof Error ? err.message : 'An error occurred');
  } finally {
    setIsUploading(false);
  }
}, []);
```

The prompt that makes it reliable

  • The LLM is only as useful as its prompt structure. We force a schema, so parsing is straightforward and less brittle than free-form responses.
```tsx
const analyzeWithLlama = async (text: string, originalFilename: string) => {
  const prompt = `Analyze this medical document and provide a detailed analysis in the following format:

1. Summary: Provide a clear, plain-English summary
2. Keywords: Extract key medical terms and their values (if any)
3. Categories: Classify into these categories: ${VALID_CATEGORIES.join(", ")}
4. Filename: Suggest a clear, descriptive filename
5. Threshold Flags: Identify any abnormal values and mark as "high", "low", or "normal"
6. PubMed References: Suggest relevant PubMed articles (just article titles)
7. Additional Notes: Any important medical guidance or observations

Document text: ${text}

Please format your response exactly as follows:
Summary: [summary]
Keywords: [key:value pairs]
Categories: [categories]
Filename: [filename]
Flags: [abnormal values]
References: [article titles]
Notes: [additional guidance]`;
```
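Because the labels are fixed, the reply can be parsed with plain string handling rather than free-form NLP. A minimal sketch of such a parser, assuming the response format above (`parseAnalysis` and `SECTION_LABELS` are our illustrative names, not the app's code):

```typescript
// Parse the label-prefixed reply produced by the structured prompt.
// Assumes each section starts at the beginning of a line ("Summary:", "Flags:", ...).
const SECTION_LABELS = ['Summary', 'Keywords', 'Categories', 'Filename', 'Flags', 'References', 'Notes'] as const;

type SectionLabel = typeof SECTION_LABELS[number];

function parseAnalysis(reply: string): Record<SectionLabel, string> {
  const result = Object.fromEntries(SECTION_LABELS.map(l => [l, ''])) as Record<SectionLabel, string>;
  let current: SectionLabel | null = null;
  for (const line of reply.split('\n')) {
    const match = line.match(/^(\w+):\s*(.*)$/);
    if (match && (SECTION_LABELS as readonly string[]).includes(match[1])) {
      current = match[1] as SectionLabel;
      result[current] = match[2];
    } else if (current) {
      // Continuation line for a multi-line section
      result[current] += (result[current] ? '\n' : '') + line;
    }
  }
  return result;
}
```

Unknown labels fall through as continuation text, so a slightly off-format reply degrades gracefully instead of throwing.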

Server-side processing (Edge Function)

  • Useful for background jobs, webhook-driven processing, or scaling beyond client limits.
  • Downloads the file from Supabase Storage, extracts text, calls the same LLaMA prompt, and inserts a documents row.
```ts
// Prepare LLaMA prompt
const prompt = `Analyze this medical document and provide a detailed analysis in the following format:

1. Summary: Provide a clear, plain-English summary
2. Keywords: Extract key medical terms and their values (if any)
3. Categories: Classify into these categories: ${VALID_CATEGORIES.join(", ")}
4. Filename: Suggest a clear, descriptive filename
5. Threshold Flags: Identify any abnormal values and mark as "high", "low", or "normal"
6. PubMed References: Suggest relevant PubMed articles (just article titles)
7. Additional Notes: Any important medical guidance or observations

Document text: ${text}

Please format your response exactly as follows:
Summary: [summary]
Keywords: [key:value pairs]
Categories: [categories]
Filename: [filename]
Flags: [abnormal values]
References: [article titles]
Notes: [additional guidance]`
```
```ts
// Insert into Supabase
const { data: insertData, error: insertError } = await supabase
  .from('documents')
  .insert(documentData)
  .select()
  .single()
```

Implementation details

Database schema (Supabase Postgres)

Use JSONB where the structure can vary or expand over time.

```sql
-- documents table
create table if not exists public.documents (
  id bigint generated always as identity primary key,
  created_at timestamp with time zone default now() not null,
  user_id uuid null,
  filename text not null,
  renamed_file text not null,
  file_url text not null,
  summary text not null,
  keywords jsonb not null default '{}'::jsonb,
  categories text[] not null default '{}',
  word_count integer not null,
  report_type text not null,
  threshold_flags jsonb not null default '{}'::jsonb,
  pubmed_refs jsonb not null default '{}'::jsonb,
  ai_notes text not null default '',
  status text not null check (status in ('uploaded','processed','failed')),
  user_notes text null,
  version integer not null default 1
);

-- Optional: RLS policies for multi-tenant setups
alter table public.documents enable row level security;

-- Example policies (tune for your auth model)
create policy "Allow read to authenticated users"
  on public.documents for select
  to authenticated
  using (true);

create policy "Insert own documents"
  on public.documents for insert
  to authenticated
  with check (auth.uid() = user_id);

create policy "Update own documents"
  on public.documents for update
  to authenticated
  using (auth.uid() = user_id);
```
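On the frontend, the same row can be given a TypeScript type so the UI stays in sync with the schema. A hypothetical mirror of the table above (`DocumentRow` is our name; field names follow the SQL):

```typescript
// Mirrors public.documents; jsonb columns arrive as parsed objects via supabase-js.
interface DocumentRow {
  id: number;
  created_at: string;               // ISO timestamp
  user_id: string | null;
  filename: string;
  renamed_file: string;
  file_url: string;
  summary: string;
  keywords: Record<string, string>;
  categories: string[];
  word_count: number;
  report_type: string;
  threshold_flags: Record<string, 'high' | 'low' | 'normal'>;
  pubmed_refs: Record<string, string> | string[];
  ai_notes: string;
  status: 'uploaded' | 'processed' | 'failed';
  user_notes: string | null;
  version: number;
}
```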

Storage bucket

  • Create medical-documents bucket.
  • Lock down access if you’re storing PHI; consider signed URLs and RLS on the storage.objects table.

Environment configuration

  • Never hardcode secrets client-side. Use environment variables and server-side access.
  • For local dev, rely on .env.local and do not commit it.

Example variables to configure:

  • SUPABASE_URL
  • SUPABASE_ANON_KEY (client reads are OK if your RLS is correct)
  • SUPABASE_SERVICE_ROLE_KEY (server-only; never expose to the browser)
  • LLAMA_API_KEY (server-only)

For the Edge Function, set:

  • SUPABASE_URL
  • SUPABASE_ANON_KEY (or service role if needed)
  • LLAMA_API_KEY
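For reference, a local `.env.local` might look like the fragment below (placeholder values, names as listed above; adjust to your framework's conventions, e.g. Next.js only exposes `NEXT_PUBLIC_`-prefixed variables to the browser):

```shell
# .env.local — never commit this file
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_ANON_KEY=your-anon-key
SUPABASE_SERVICE_ROLE_KEY=your-service-role-key   # server-only
LLAMA_API_KEY=your-llama-api-key                  # server-only
```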

Deploying the Edge Function

  • Install the Supabase CLI
  • Link your project
  • Deploy function
```shell
supabase functions deploy process-medical-pdf
supabase functions list
supabase functions serve process-medical-pdf --no-verify-jwt
```

Wire the function behind an HTTP trigger or call it from your app to process files already stored in the bucket.

UX considerations

  • Drag-and-drop uploader with clear accept types.
  • Progress and error state visibility.
  • Terse, readable summaries with expandable details.
  • Badges for categories and flags for abnormalities.

Reliability strategies

  • Structured prompt → predictable parsing.
  • Keep LLM temperature moderate (0.3–0.7) to reduce variance.
  • Validate parsed JSON fields; default to safe fallbacks.
  • Track version and status to support re-processing and migrations.
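Validation can be a thin guard that coerces malformed LLM output into safe defaults before the insert. A sketch under assumed names (`safeCategories`, `safeFlag`, and this `VALID_CATEGORIES` list are illustrative; the real list lives in the app code):

```typescript
// Hypothetical category list — the real VALID_CATEGORIES is defined in the app.
const VALID_CATEGORIES = ['Lab Results', 'Imaging', 'Clinical Notes', 'Prescription', 'Other'];

// Keep only known categories; fall back to 'Other' on anything unexpected.
function safeCategories(raw: unknown): string[] {
  if (!Array.isArray(raw)) return ['Other'];
  const kept = raw.filter((c): c is string => typeof c === 'string' && VALID_CATEGORIES.includes(c));
  return kept.length > 0 ? kept : ['Other'];
}

// Flags outside the allowed vocabulary default to 'normal' rather than throwing.
function safeFlag(raw: unknown): 'high' | 'low' | 'normal' {
  return raw === 'high' || raw === 'low' ? raw : 'normal';
}
```

Defaulting (rather than rejecting) keeps a single malformed field from failing the whole document.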

Security and compliance

  • Treat all uploads as potentially sensitive (PHI).
  • Don’t expose secrets in the browser. Move LLaMA calls server-side if needed.
  • Consider de-identification or redaction at upload.
  • Encrypt at rest (Supabase handles storage encryption), and use HTTPS for all calls.
  • RLS across documents and signed URLs for downloads.

Performance and cost

  • OCR (tesseract.js) can be CPU-heavy; pre-processing images helps (deskew, denoise, contrast).
  • Use server-side processing for large PDFs or batch jobs.
  • Cache repeated LLM calls when re-processing the same file or version.
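Caching those repeated calls mostly needs a stable key; hashing the extracted text together with a prompt version works. A sketch (`cacheKey` and the FNV-1a hash here are our illustration, not the app's code):

```typescript
// 32-bit FNV-1a hash — deterministic and dependency-free.
function fnv1a(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// Same text + same prompt version → same key, so a stored analysis can be reused.
function cacheKey(text: string, promptVersion: number): string {
  return `analysis:v${promptVersion}:${fnv1a(text)}`;
}
```

Bumping `promptVersion` whenever the prompt schema changes invalidates old cache entries automatically.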

What we’d build next

  • Normalized lab values with medical ontologies (e.g., LOINC) and unit conversions.
  • Trend analysis across time and change detection.
  • Confidence scoring and a reviewer checklist for clinical safety.
  • Human-in-the-loop editing with audit trails.
  • Export to FHIR-compatible bundles.

Demo script (5 minutes)

  • Upload a lab report PDF.
  • Show immediate “Processing document…” state.
  • Reveal results: summary, categories, abnormal flags, and PubMed suggestions.
  • Click through the stored file link (signed URL if private).
  • Open Supabase Studio to show the corresponding documents row.

Credits

  • Built at the Caltech Longevity Hackathon by our team in a sprint focused on turning complex medical paperwork into fast, explainable insights.

