Let me help you understand where we are with the database creation and provide clear documentation of our progress.

First, let’s check if we have actually created a database:

# Check if database exists
sqlite3 D:/Projects/LegalTextAnalysis/data/legal_text.db ".databases"

# List any tables if they exist
sqlite3 D:/Projects/LegalTextAnalysis/data/legal_text.db ".tables"

You’re right - we need to first create the database before defining its structure. Let me create a markdown file documenting our setup process:

# Legal Text Analysis Database Setup

*Project Documentation - Database Structure and Setup*

## Initial Database Creation

Before creating tables, we must first create our SQLite database:

```bash
# Create new SQLite database
sqlite3 D:/Projects/LegalTextAnalysis/data/legal_text.db
```
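For scripted setups, the same step can be done from Python's built-in `sqlite3` module. The following is a minimal sketch, not the project's actual setup code: the helper name `open_database` is hypothetical, and the demo writes to a temporary directory rather than the project path.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_database(db_path: Path) -> sqlite3.Connection:
    """Open the SQLite database at db_path, creating it if needed (hypothetical helper)."""
    # sqlite3.connect does not create missing parent directories.
    db_path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(db_path)
    # SQLite only enforces FOREIGN KEY constraints when this is enabled,
    # and the setting applies per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    return conn


# The demo uses a temporary directory; the project itself would pass
# D:/Projects/LegalTextAnalysis/data/legal_text.db instead.
demo_path = Path(tempfile.mkdtemp()) / "data" / "legal_text.db"
conn = open_database(demo_path)

# Equivalent of the ".tables" check: a fresh database has no tables yet.
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
fk_enabled = conn.execute("PRAGMA foreign_keys").fetchone()[0]
print(tables, fk_enabled)
conn.close()
```

Enabling foreign keys at connection time matters for the tables defined later, since two of them reference `legal_cases(case_id)`.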

Database Structure

Our database is designed to support legal text analysis with three main tables:

1. Primary Cases Table (legal_cases)

Stores the raw case data and basic metrics:

CREATE TABLE legal_cases (
    case_id TEXT PRIMARY KEY,
    case_outcome TEXT,
    case_title TEXT,
    case_text TEXT,
    text_length INTEGER,
    word_count INTEGER,
    citation_count INTEGER,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
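The basic metric columns can be filled in at insert time. The sketch below shows one way to do that and rests on assumptions the documentation does not state: `text_length` is taken as a character count, `word_count` as whitespace-separated tokens, and `citation_count` uses a hypothetical regular expression for citations of the form `[2007] FCA 1234`. The helper names are illustrative as well.

```python
import re
import sqlite3
from typing import Tuple

# Hypothetical pattern for neutral citations such as "[2007] FCA 1234".
# The source does not define what counts as a citation.
CITATION_RE = re.compile(r"\[\d{4}\]\s+[A-Z]+\s+\d+")


def basic_metrics(text: str) -> Tuple[int, int, int]:
    """Return (text_length, word_count, citation_count) for one case text."""
    text_length = len(text)          # characters (assumption)
    word_count = len(text.split())   # whitespace-separated tokens (assumption)
    citation_count = len(CITATION_RE.findall(text))
    return text_length, word_count, citation_count


conn = sqlite3.connect(":memory:")  # in-memory database for the demo
conn.execute("""
    CREATE TABLE legal_cases (
        case_id TEXT PRIMARY KEY,
        case_outcome TEXT,
        case_title TEXT,
        case_text TEXT,
        text_length INTEGER,
        word_count INTEGER,
        citation_count INTEGER,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
""")


def insert_case(conn, case_id, outcome, title, text) -> None:
    """Insert one case; created_at is filled in by the column default."""
    conn.execute(
        "INSERT INTO legal_cases (case_id, case_outcome, case_title, case_text,"
        " text_length, word_count, citation_count) VALUES (?, ?, ?, ?, ?, ?, ?)",
        (case_id, outcome, title, text, *basic_metrics(text)),
    )


sample_text = (
    "The court applied Smith v Jones [2007] FCA 1234 and dismissed the appeal."
)
insert_case(conn, "Case1", "dismissed", "Example v Example", sample_text)
conn.commit()

row = conn.execute(
    "SELECT text_length, word_count, citation_count, created_at "
    "FROM legal_cases WHERE case_id = ?",
    ("Case1",),
).fetchone()
print(row)
```

Parameter placeholders (`?`) keep the case text out of the SQL string, which matters for judgments that contain quotes.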

2. Text Analysis Metrics Table (text_metrics)

Stores advanced text analysis measurements:

CREATE TABLE text_metrics (
    case_id TEXT PRIMARY KEY,
    sentence_count INTEGER,
    citation_density REAL,
    sentence_complexity REAL,
    length_category TEXT,
    FOREIGN KEY (case_id) REFERENCES legal_cases(case_id)
);
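One detail worth checking here: SQLite parses the `FOREIGN KEY` clause but enforces it only when `PRAGMA foreign_keys = ON` has been set on the connection. The sketch below illustrates that behaviour against an in-memory database; the metric values and `length_category` labels are made-up placeholders, not project data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Without this pragma the orphan insert below would silently succeed.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE legal_cases (
        case_id TEXT PRIMARY KEY,
        case_outcome TEXT
    );
    CREATE TABLE text_metrics (
        case_id TEXT PRIMARY KEY,
        sentence_count INTEGER,
        citation_density REAL,
        sentence_complexity REAL,
        length_category TEXT,
        FOREIGN KEY (case_id) REFERENCES legal_cases(case_id)
    );
""")

# A metrics row for an existing case is accepted.
conn.execute(
    "INSERT INTO legal_cases (case_id, case_outcome) VALUES (?, ?)",
    ("Case1", "dismissed"),
)
conn.execute(
    "INSERT INTO text_metrics VALUES (?, ?, ?, ?, ?)",
    ("Case1", 42, 0.05, 18.3, "medium"),
)

# A metrics row for a case that does not exist is rejected.
try:
    conn.execute(
        "INSERT INTO text_metrics VALUES (?, ?, ?, ?, ?)",
        ("Case999", 10, 0.0, 12.0, "short"),
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The shared key lets the raw case data and its metrics be read together.
joined = conn.execute(
    "SELECT c.case_id, c.case_outcome, m.sentence_count "
    "FROM legal_cases AS c JOIN text_metrics AS m ON m.case_id = c.case_id"
).fetchall()
print(orphan_rejected, joined)
```

Because `case_id` is both the primary key and the foreign key of `text_metrics`, each case can have at most one metrics row.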

3. TF-IDF Scores Table (tfidf_scores)

Stores term frequency-inverse document frequency analysis results:

CREATE TABLE tfidf_scores (
    case_id TEXT,
    term TEXT,
    score REAL,
    term_category TEXT,  -- 'high' or 'low' significance
    PRIMARY KEY (case_id, term),
    FOREIGN KEY (case_id) REFERENCES legal_cases(case_id)
);
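To show how rows might end up in this table, here is a small sketch that computes TF-IDF for a toy corpus and stores the result. It uses one common variant (term frequency as count divided by document length, IDF as the natural log of N over document frequency), which may differ from what the project actually uses. The three sample texts and the `HIGH_THRESHOLD` cut-off for `term_category` are hypothetical.

```python
import math
import sqlite3
from collections import Counter

# Toy corpus; real case texts would come from legal_cases.case_text.
docs = {
    "Case1": "appeal dismissed costs awarded appeal",
    "Case2": "appeal allowed damages awarded",
    "Case3": "contract breach damages",
}
HIGH_THRESHOLD = 0.2  # hypothetical cut-off between 'high' and 'low'


def tfidf_rows(docs):
    """Return (case_id, term, score, term_category) tuples for every term."""
    tokenized = {cid: text.split() for cid, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(t for tokens in tokenized.values() for t in set(tokens))
    rows = []
    for cid, tokens in tokenized.items():
        for term, count in Counter(tokens).items():
            score = (count / len(tokens)) * math.log(n_docs / df[term])
            category = "high" if score >= HIGH_THRESHOLD else "low"
            rows.append((cid, term, score, category))
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE legal_cases (case_id TEXT PRIMARY KEY, case_text TEXT);
    CREATE TABLE tfidf_scores (
        case_id TEXT,
        term TEXT,
        score REAL,
        term_category TEXT,  -- 'high' or 'low' significance
        PRIMARY KEY (case_id, term),
        FOREIGN KEY (case_id) REFERENCES legal_cases(case_id)
    );
""")
conn.executemany("INSERT INTO legal_cases VALUES (?, ?)", docs.items())
# INSERT OR REPLACE overwrites an existing (case_id, term) pair on re-runs.
conn.executemany(
    "INSERT OR REPLACE INTO tfidf_scores VALUES (?, ?, ?, ?)", tfidf_rows(docs)
)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM tfidf_scores").fetchone()[0]
scores = {
    (cid, term): (score, cat)
    for cid, term, score, cat in conn.execute("SELECT * FROM tfidf_scores")
}
top_terms = conn.execute(
    "SELECT term, score FROM tfidf_scores WHERE case_id = ? "
    "ORDER BY score DESC LIMIT 3",
    ("Case1",),
).fetchall()
print(row_count, top_terms)
```

The composite primary key `(case_id, term)` is what allows one row per term per case while still rejecting duplicates.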

Performance Optimization

Indices for improved query performance:

CREATE INDEX idx_case_outcome ON legal_cases(case_outcome);
CREATE INDEX idx_text_length ON legal_cases(text_length);
CREATE INDEX idx_term_scores ON tfidf_scores(score);
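Whether an index is actually used can be checked with `EXPLAIN QUERY PLAN`. The sketch below runs it for a filter on `case_outcome` against an empty in-memory copy of the table. The query itself is an illustrative assumption, and the exact wording of the plan text varies between SQLite versions, so the code only looks for the index name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE legal_cases (
        case_id TEXT PRIMARY KEY,
        case_outcome TEXT,
        case_title TEXT,
        case_text TEXT,
        text_length INTEGER,
        word_count INTEGER,
        citation_count INTEGER,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );
    CREATE INDEX idx_case_outcome ON legal_cases(case_outcome);
    CREATE INDEX idx_text_length ON legal_cases(text_length);
""")

# Ask SQLite how it would execute the query, without running it.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT case_id, case_title FROM legal_cases WHERE case_outcome = ?",
    ("dismissed",),
).fetchall()

# The human-readable description is the last column of each plan row.
plan_details = [row[-1] for row in plan]
uses_index = any("idx_case_outcome" in detail for detail in plan_details)

for detail in plan_details:
    print(detail)
print("uses idx_case_outcome:", uses_index)
```

If the plan reports a full table scan instead, the query's `WHERE` clause probably does not match the indexed column.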

Useful SQLite Commands