Merged
11 changes: 11 additions & 0 deletions .claude/settings.local.json
@@ -0,0 +1,11 @@
{
"permissions": {
"allow": [
"Bash(git add:*)",
"Bash(git commit:*)",
"Bash(git push:*)",
"Bash(curl:*)"
],
"deny": []
}
}
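The allowlist entries above appear to follow a `Tool(prefix:*)` glob convention. As an illustration only (this is shell glob matching standing in for the tool's actual permission logic, which is not shown in this PR), an entry like `Bash(git push:*)` can be thought of as a prefix pattern gating candidate commands:

```shell
# Illustrative only: approximate how an allowlist entry such as
# "Bash(git push:*)" might gate a command, using shell glob matching
# as a stand-in for the real permission check.
pattern="git push:*"        # inner pattern of "Bash(git push:*)" (assumed semantics)
cmd="git push:origin main"  # candidate command, encoded the same way

case "$cmd" in
  $pattern) echo "allowed" ;;
  *)        echo "denied"  ;;
esac
```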
107 changes: 44 additions & 63 deletions .github/workflows/database-dump.yml
@@ -14,10 +14,40 @@ jobs:
actions: write

steps:
- name: Set up PostgreSQL client
- name: Cache PostgreSQL client
id: cache-postgresql
uses: actions/cache@v3
with:
path: |
/usr/lib/postgresql/17
/usr/share/postgresql/17
/usr/bin/pg_dump
/usr/bin/pg_restore
/usr/bin/psql
key: ${{ runner.os }}-postgresql-client-17-${{ hashFiles('.github/workflows/database-dump.yml') }}
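The cache key embeds a hash of the workflow file itself, so any edit to the install step invalidates the cached client. A minimal sketch of that content-addressed keying, with `sha256sum` standing in for GitHub's `hashFiles()` and an illustrative temp file standing in for the workflow file:

```shell
# Sketch: the cache key changes whenever the hashed file changes
# (sha256sum stands in for hashFiles(); the file and key prefix are illustrative).
wf=$(mktemp)

printf 'install step v1\n' > "$wf"
key1="Linux-postgresql-client-17-$(sha256sum "$wf" | cut -d' ' -f1)"

printf 'install step v2\n' > "$wf"
key2="Linux-postgresql-client-17-$(sha256sum "$wf" | cut -d' ' -f1)"

# Different file contents -> different key -> the old cache entry is skipped.
[ "$key1" != "$key2" ] && echo "key changed, cache will be rebuilt"
rm -f "$wf"
```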

- name: Set up PostgreSQL 17 client
if: steps.cache-postgresql.outputs.cache-hit != 'true'
run: |
# Add the official PostgreSQL APT repository (keyring-based signing, replacing the deprecated apt-key)
sudo apt-get update
sudo apt-get install -y postgresql-client
sudo apt-get install -y wget ca-certificates gnupg

# Create keyrings directory if it doesn't exist
sudo mkdir -p /usr/share/keyrings

# Download and add the signing key with proper conversion
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | \
gpg --dearmor | \
sudo tee /usr/share/keyrings/postgresql-archive-keyring.gpg > /dev/null

# Add the repository with signed-by option
echo "deb [signed-by=/usr/share/keyrings/postgresql-archive-keyring.gpg] http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" | \
sudo tee /etc/apt/sources.list.d/pgdg.list

# Update and install PostgreSQL 17 client
sudo apt-get update
sudo apt-get install -y postgresql-client-17
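After installing, it is worth confirming that the version-17 client actually wins the PATH race over any preinstalled client, since `pg_dump` refuses to dump from a server newer than itself. A sketch of extracting the major version from `pg_dump --version` output (the sample string below is illustrative; on a real runner it would come from running the command):

```shell
# Parse the major version out of `pg_dump --version` output.
# version_line=$(pg_dump --version)        # on a real runner
version_line="pg_dump (PostgreSQL) 17.4"   # sample output, for illustration

major="${version_line##* }"   # drop everything through the last space -> "17.4"
major="${major%%.*}"          # drop the minor part -> "17"
echo "$major"                 # 17
```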

- name: Create dump directory
run: mkdir -p database_dumps
@@ -33,75 +63,26 @@ jobs:
export PGUSER=$(echo $DATABASE_URL | sed -E 's/postgres:\/\/([^:]+):.*/\1/')
export PGPASSWORD=$(echo $DATABASE_URL | sed -E 's/postgres:\/\/[^:]+:([^@]+)@.*/\1/')

# Create dump excluding users table
DUMP_FILE="database_dumps/db_dump_$(date +%Y%m%d_%H%M%S).sql"
# Create dump excluding users table in custom archive format
DUMP_FILE="database_dumps/db_dump_$(date +%Y%m%d_%H%M%S).dump"

# Dump schema and data, excluding the users table
# Dump schema and data in custom format, excluding the users table
pg_dump --no-owner --no-privileges \
--format=custom \
--exclude-table=users \
--exclude-table=schema_migrations \
--exclude-table=ar_internal_metadata \
-f "$DUMP_FILE"

# Compress the dump
gzip "$DUMP_FILE"
echo "DUMP_FILE=${DUMP_FILE}.gz" >> $GITHUB_ENV
echo "Dump created: ${DUMP_FILE}.gz"
echo "DUMP_FILE=${DUMP_FILE}" >> $GITHUB_ENV
echo "Dump created: ${DUMP_FILE}"
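The `sed` extractions earlier in this step can be exercised against a sample connection URL; the credentials below are fabricated for illustration:

```shell
# Same sed extractions as the workflow step, run against a fabricated URL.
DATABASE_URL="postgres://alice:s3cret@db.example.com:5432/myapp_production"

# Capture the user (everything between "postgres://" and the first ":").
PGUSER=$(echo "$DATABASE_URL" | sed -E 's/postgres:\/\/([^:]+):.*/\1/')
# Capture the password (between that ":" and the "@").
PGPASSWORD=$(echo "$DATABASE_URL" | sed -E 's/postgres:\/\/[^:]+:([^@]+)@.*/\1/')

echo "$PGUSER"      # alice
echo "$PGPASSWORD"  # s3cret
```

Note this simple pattern assumes the password contains no `@`; URL-encoded passwords would need decoding first.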

- name: Upload dump as artifact using GitHub API
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
# Get the dump file name
DUMP_FILE_NAME=$(basename "$DUMP_FILE")

# Create a unique artifact name with timestamp
ARTIFACT_NAME="database-dump-$(date +%Y%m%d-%H%M%S)"

# Get workflow run ID
RUN_ID="${{ github.run_id }}"

# Create artifact upload
echo "Creating artifact upload..."
UPLOAD_RESPONSE=$(curl -L \
-X POST \
-H "Accept: application/vnd.github+json" \
-H "Authorization: Bearer $GITHUB_TOKEN" \
-H "X-GitHub-Api-Version: 2022-11-28" \
"https://api.github.com/repos/${{ github.repository }}/actions/runs/${RUN_ID}/artifacts" \
-d "{\"name\":\"${ARTIFACT_NAME}\", \"retention_days\": 30}")

# Extract upload URL and other details
UPLOAD_URL=$(echo "$UPLOAD_RESPONSE" | jq -r '.upload_url')
ARTIFACT_ID=$(echo "$UPLOAD_RESPONSE" | jq -r '.id')

if [ "$UPLOAD_URL" = "null" ] || [ -z "$UPLOAD_URL" ]; then
echo "Failed to create artifact upload"
echo "Response: $UPLOAD_RESPONSE"
exit 1
fi

# Upload the file
echo "Uploading dump file..."
curl -L \
-X PUT \
-H "Accept: application/vnd.github+json" \
-H "Authorization: Bearer $GITHUB_TOKEN" \
-H "X-GitHub-Api-Version: 2022-11-28" \
-H "Content-Type: application/gzip" \
--data-binary "@$DUMP_FILE" \
"$UPLOAD_URL"

# Finalize the artifact
echo "Finalizing artifact..."
curl -L \
-X POST \
-H "Accept: application/vnd.github+json" \
-H "Authorization: Bearer $GITHUB_TOKEN" \
-H "X-GitHub-Api-Version: 2022-11-28" \
"https://api.github.com/repos/${{ github.repository }}/actions/artifacts/${ARTIFACT_ID}/finalize"

echo "Database dump uploaded as artifact: ${ARTIFACT_NAME}"
- name: Upload dump as artifact
uses: actions/upload-artifact@v4
with:
name: database-dump-${{ github.run_number }}-${{ github.run_attempt }}
path: ${{ env.DUMP_FILE }}
retention-days: 30
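The artifact name combines the run number and run attempt, so re-running the same workflow produces a distinct artifact rather than a name collision. A sketch with illustrative values (in the workflow these come from `github.run_number` and `github.run_attempt`):

```shell
# How the artifact name above is assembled (values are illustrative).
run_number=42
run_attempt=1
name="database-dump-${run_number}-${run_attempt}"
echo "$name"   # database-dump-42-1
```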

- name: Clean up old artifacts
env:
40 changes: 36 additions & 4 deletions README.md
@@ -38,12 +38,12 @@
These are extracted using an LLM from the Entry's raw data. Each entry might have multiple activities.

`Evidence`:
Evidence links an Activity to a Promise. They are linked using an LLM.



### 🛠 Setup
Ensure you have Ruby and PostgresQL installed
Ensure you have Ruby, PostgreSQL, and the GitHub CLI installed

```bash
# Install dependencies
@@ -55,9 +55,41 @@ sudo service postgresql start

# Setup database
rails db:create
rails db:migrate
rails db:seed
rake db:fetch_and_restore

# Run the server
rails s
```

### 🚀 Developer Onboarding

For new developers joining the project, we provide a streamlined onboarding process using production database dumps:

#### Quick Start with Production Data

1. **Prerequisites**:
- Install the GitHub CLI: https://cli.github.com/
- Authenticate with: `gh auth login`

2. **Restore from Latest Database Dump**:
```bash
# List available database dumps
rake db:list_dumps

# Fetch and restore the latest production database dump
# This will download the most recent weekly backup and restore it locally
rake db:fetch_and_restore
```

3. **What's Included**: The database dump includes all production data except:
- User accounts (for privacy/security)
- Schema migrations metadata
- Internal Rails metadata

4. **Post-Restore**: After restoring, the rake task automatically runs any pending migrations
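The rake task's internals are not shown in this diff, but before handing a downloaded file to `pg_restore`, a cheap sanity check against the workflow's naming convention can catch a bad or truncated download early. This guard is an assumption, sketched here for illustration:

```shell
# Assumed pre-restore guard: check the file follows the workflow's
# db_dump_<timestamp>.dump naming convention before restoring.
dump="database_dumps/db_dump_20250101_020000.dump"

case "$(basename "$dump")" in
  db_dump_*.dump) echo "looks like a custom-format dump" ;;
  *)              echo "unexpected file name" ;;
esac
```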

#### Database Dumps Schedule

- Production database is automatically dumped weekly (every Monday at 2 AM UTC)
- Dumps are stored as GitHub Actions artifacts for 30 days
- Dumps use PostgreSQL's custom archive format for efficient storage and restore