feat: add Astro DB #304
u.startsWith('https://vimeo.com/')
)
resourceType = 4;
else if (u.indexOf('udemy.com/') > -1 || u.indexOf('course') > -1) resourceType = 3;
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization
)
resourceType = 4;
else if (u.indexOf('udemy.com/') > -1 || u.indexOf('course') > -1) resourceType = 3;
else if (u.indexOf('amazon.com/') > -1 || u.indexOf('pdf') > -1 || u.indexOf('book') > -1) resourceType = 2;
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization
Copilot Autofix
AI over 1 year ago
To fix the problem, we need to parse the URL and check its host against a whitelist of allowed hosts. This ensures that the host is exactly what we expect and not just a substring within a potentially malicious URL.
- Use the url module to parse the URL and extract the host.
- Define a whitelist of allowed hosts.
- Check if the parsed host is in the whitelist before categorizing the resource.
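To illustrate why CodeQL flags the substring checks, here is a minimal TypeScript sketch. The helper names `substringCheck` and `hostCheck` and the `evil.example` URL are hypothetical and not part of this PR; the only API used is the standard `URL` class.

```typescript
// Hypothetical helper: mirrors the flagged pattern, matching the text anywhere in the URL.
function substringCheck(url: string): boolean {
  return url.toLowerCase().indexOf('udemy.com/') > -1;
}

// Hypothetical helper: parses the URL first and compares only the hostname.
function hostCheck(url: string, allowed: ReadonlySet<string>): boolean {
  try {
    return allowed.has(new URL(url).hostname);
  } catch {
    return false; // not a parseable absolute URL
  }
}

const allowed = new Set(['udemy.com', 'www.udemy.com']);
// The expected host appears in the path, not in the host position.
const spoofed = 'https://evil.example/udemy.com/free-course';

console.log(substringCheck(spoofed)); // true
console.log(hostCheck(spoofed, allowed)); // false
console.log(hostCheck('https://www.udemy.com/course/x', allowed)); // true
```

The substring test accepts the spoofed URL because `udemy.com/` occurs in its path, while the parsed-host test rejects it.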
@@ -1,2 +1,3 @@
 import * as fs from 'fs';
+import { URL } from 'url';
 import { Author, Like, LikeTest, NOW, Rating, RelationResourceTag, Resource, ResourceType, Tag, TagType, Taxonomy, TaxonomyType, User, Visits, db } from 'astro:db';
@@ -305,15 +306,25 @@
 const u = r.url.toLowerCase();
-if (
-  u.startsWith('https://youtube.com/') ||
-  u.startsWith('http://youtube.com/') ||
-  u.startsWith('https://www.youtube.com/') ||
-  u.startsWith('http://www.youtube.com/') ||
-  u.startsWith('https://youtu.be/') ||
-  u.startsWith('https://vimeo.com/')
-)
-  resourceType = 4;
-else if (u.indexOf('udemy.com/') > -1 || u.indexOf('course') > -1) resourceType = 3;
-else if (u.indexOf('amazon.com/') > -1 || u.indexOf('pdf') > -1 || u.indexOf('book') > -1) resourceType = 2;
-else if (u.indexOf('github.com/') > -1 || u.indexOf('gitlab.com/') > -1) resourceType = 5;
-else if (u.indexOf('medium.com/') > -1 || u.indexOf('dev.to/') > -1 || u.indexOf('blog') > -1 || t.indexOf('blog') > -1 || u.indexOf('.pdf') > -1) resourceType = 6;
+const parsedUrl = new URL(u);
+const host = parsedUrl.host;
+const allowedHosts = {
+  'youtube.com': 4,
+  'www.youtube.com': 4,
+  'youtu.be': 4,
+  'vimeo.com': 4,
+  'udemy.com': 3,
+  'amazon.com': 2,
+  'github.com': 5,
+  'gitlab.com': 5,
+  'medium.com': 6,
+  'dev.to': 6
+};
+if (allowedHosts[host]) {
+  resourceType = allowedHosts[host];
+} else if (u.indexOf('course') > -1) {
+  resourceType = 3;
+} else if (u.indexOf('pdf') > -1 || u.indexOf('book') > -1) {
+  resourceType = 2;
+} else if (u.indexOf('blog') > -1 || t.indexOf('blog') > -1 || u.indexOf('.pdf') > -1) {
+  resourceType = 6;
+}
 r.resource_type_id = !!row.resource_type_id ? row.resource_type_id : resourceType;
resourceType = 4;
else if (u.indexOf('udemy.com/') > -1 || u.indexOf('course') > -1) resourceType = 3;
else if (u.indexOf('amazon.com/') > -1 || u.indexOf('pdf') > -1 || u.indexOf('book') > -1) resourceType = 2;
else if (u.indexOf('github.com/') > -1 || u.indexOf('gitlab.com/') > -1) resourceType = 5;
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization
resourceType = 4;
else if (u.indexOf('udemy.com/') > -1 || u.indexOf('course') > -1) resourceType = 3;
else if (u.indexOf('amazon.com/') > -1 || u.indexOf('pdf') > -1 || u.indexOf('book') > -1) resourceType = 2;
else if (u.indexOf('github.com/') > -1 || u.indexOf('gitlab.com/') > -1) resourceType = 5;
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization
else if (u.indexOf('udemy.com/') > -1 || u.indexOf('course') > -1) resourceType = 3;
else if (u.indexOf('amazon.com/') > -1 || u.indexOf('pdf') > -1 || u.indexOf('book') > -1) resourceType = 2;
else if (u.indexOf('github.com/') > -1 || u.indexOf('gitlab.com/') > -1) resourceType = 5;
else if (u.indexOf('medium.com/') > -1 || u.indexOf('dev.to/') > -1 || u.indexOf('blog') > -1 || t.indexOf('blog') > -1 || u.indexOf('.pdf') > -1) resourceType = 6;
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization
Copilot Autofix
AI over 1 year ago
To fix the problem, we need to parse the URL and check the host component explicitly against a whitelist of allowed hosts. This approach ensures that the check is accurate and not prone to substring matching issues.
- Parse the URL using a URL parsing library to extract the host component.
- Define a whitelist of allowed hosts.
- Check if the parsed host is in the whitelist.
- Update the resource type assignment logic to use this secure check.
@@ -304,16 +304,16 @@
 const t = r.title.toLowerCase();
-const u = r.url.toLowerCase();
-if (
-  u.startsWith('https://youtube.com/') ||
-  u.startsWith('http://youtube.com/') ||
-  u.startsWith('https://www.youtube.com/') ||
-  u.startsWith('http://www.youtube.com/') ||
-  u.startsWith('https://youtu.be/') ||
-  u.startsWith('https://vimeo.com/')
-)
-  resourceType = 4;
-else if (u.indexOf('udemy.com/') > -1 || u.indexOf('course') > -1) resourceType = 3;
-else if (u.indexOf('amazon.com/') > -1 || u.indexOf('pdf') > -1 || u.indexOf('book') > -1) resourceType = 2;
-else if (u.indexOf('github.com/') > -1 || u.indexOf('gitlab.com/') > -1) resourceType = 5;
-else if (u.indexOf('medium.com/') > -1 || u.indexOf('dev.to/') > -1 || u.indexOf('blog') > -1 || t.indexOf('blog') > -1 || u.indexOf('.pdf') > -1) resourceType = 6;
+const u = new URL(r.url.toLowerCase());
+const host = u.host;
+const allowedHosts = {
+  4: ['youtube.com', 'www.youtube.com', 'youtu.be', 'vimeo.com'],
+  3: ['udemy.com'],
+  2: ['amazon.com'],
+  5: ['github.com', 'gitlab.com'],
+  6: ['medium.com', 'dev.to']
+};
+if (allowedHosts[4].includes(host)) resourceType = 4;
+else if (allowedHosts[3].includes(host) || u.pathname.includes('course')) resourceType = 3;
+else if (allowedHosts[2].includes(host) || u.pathname.includes('pdf') || u.pathname.includes('book')) resourceType = 2;
+else if (allowedHosts[5].includes(host)) resourceType = 5;
+else if (allowedHosts[6].includes(host) || u.pathname.includes('blog') || t.includes('blog') || u.pathname.endsWith('.pdf')) resourceType = 6;
 r.resource_type_id = !!row.resource_type_id ? row.resource_type_id : resourceType;
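The host-allowlist idea from the autofix suggestions can also be sketched as a standalone function. This is only an illustration under stated assumptions: `HOST_TYPES` and `classifyByHost` are hypothetical names, the numeric ids are copied from the suggestions above without knowing what they stand for, and the `www.` stripping and the invalid-URL fallback are additions that the suggestions do not include.

```typescript
// Hypothetical lookup table; the ids (2 to 6) are taken from the suggested fix.
const HOST_TYPES = new Map<string, number>([
  ['youtube.com', 4],
  ['youtu.be', 4],
  ['vimeo.com', 4],
  ['udemy.com', 3],
  ['amazon.com', 2],
  ['github.com', 5],
  ['gitlab.com', 5],
  ['medium.com', 6],
  ['dev.to', 6],
]);

// Hypothetical helper: returns the type id for a known host, or undefined.
function classifyByHost(rawUrl: string): number | undefined {
  let hostname: string;
  try {
    // `new URL` throws on relative or malformed input, so guard it.
    hostname = new URL(rawUrl).hostname;
  } catch {
    return undefined;
  }
  // Assumption: treat "www.example.com" the same as "example.com".
  const bare = hostname.startsWith('www.') ? hostname.slice(4) : hostname;
  return HOST_TYPES.get(bare);
}

console.log(classifyByHost('https://www.youtube.com/watch?v=1'));
console.log(classifyByHost('https://evil.example/github.com/'));
console.log(classifyByHost('not a url'));
```

The sketch reads `hostname` rather than `host` because `host` includes any non-default port, which would make an exact lookup miss. A caller could fall back to the keyword checks on the path or title when the function returns `undefined`.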
Your database schema is up-to-date.
📚 Description
This PR adds Astro DB as a data source, both for local development (via CSV seeder files) and remote Studio connection.
This is still a draft until I manage to make the resource page functional, plus all the taxonomy pages. The rest will have to be optimized in further PRs.
Known issues include:
- seed.ts - maybe I should use fetching techniques instead of the file system
I should fix them before merging this.
🔗 Linked issue(s)
Fixes #12 (Figure out initial database solution).
Fixes #73 (Refactor getStaticPaths).
Fixes #76 (Add all document links).
Fixes #100 (YouTube Embeds).
Fixes #137 (Refactor ResourceList and ResourceTOC).
Fixes #159 (Add Supabase).
Fixes #160 (Test Supabase as initial storage solution).
Fixes #170 (Redirect table in Supabase).
Fixes #284 (Test AstroDB as initial storage solution).
❓ Type of change
📄 Changelog
- pnpm up
- ViewTransition and Astro Server Islands as a POC
✅ Checklist