47dd51af4a
Clean up folder structure, remove unused data, add images for frontend use (maybe - keeping options open)
main
Prabhaav Pillai
2026-04-04 01:10:57 -04:00
6eaaa4c4e6
readme change to test gitea
Prabhaav Pillai
2026-04-03 20:55:19 -04:00
d4777b5e72
Updated page.tsx
Vadella, Anna
2026-04-01 13:02:29 -04:00
90a551c048
Merge branch 'main' of https://github.com/IshaAtteri/datamining_881 changes to embeddings for plot
IshaAtteri
2026-03-26 14:05:55 -04:00
ee358acf64
text embeddings for plot
IshaAtteri
2026-03-26 14:05:02 -04:00
3a912bf09e
Frontend changes
Vadella, Anna
2026-03-26 12:35:58 -04:00
24e0d2cc21
Ground Truth Spreadsheets and Code used to move all pictures into a folder
prabhaavp
2026-03-26 01:18:43 -04:00
496761ca78
director and cast preprocessing
IshaAtteri
2026-03-25 18:21:16 -04:00
233fa3df17
preprocessing changes
IshaAtteri
2026-03-25 18:14:03 -04:00
db20497deb
Merge branch 'main' of https://github.com/IshaAtteri/datamining_881
prabhaavp
2026-03-19 13:03:28 -04:00
a06d3c8963
Spreadsheets folder
prabhaavp
2026-03-19 13:03:13 -04:00
41eaba161b
structural changes
IshaAtteri
2026-03-19 12:49:10 -04:00
c5d1ff3ab4
some changes
IshaAtteri
2026-03-19 12:45:32 -04:00
db645f3bbe
changes
IshaAtteri
2026-03-19 12:32:56 -04:00
492160c3a3
Revisions to Zim parsing, netflix parsing, and updates to html scraping to include synopsis
prabhaavp
2026-03-19 01:56:14 -04:00
0a70920ba9
Merge branch 'main' of https://github.com/IshaAtteri/datamining_881
prabhaavp
2026-03-19 01:55:21 -04:00
a8b7a6d20c
Revisions to Zim parsing, netflix parsing, and updates to html scraping to include synopsis
prabhaavp
2026-03-19 01:54:55 -04:00
300e5dec16
Switch to using Webpack instead of Turbopack
Vadella, Anna
2026-03-18 16:51:56 -04:00
08e5739638
Create base for frontend interface
Vadella, Anna
2026-03-18 16:21:51 -04:00
279fe399ed
Minor Cleanup of files. Moved to unused folder.
prabhaavp
2026-03-17 01:24:09 -04:00
52f8d8faef
the spreadsheet
prabhaavp
2026-03-12 14:21:13 -04:00
2638de1191
The code to extract zim into a spreadsheet.
prabhaavp
2026-03-12 14:19:40 -04:00
a435592f75
Merge branch 'main' of https://github.com/IshaAtteri/datamining_881 into isha
isha
IshaAtteri
2026-03-12 12:41:15 -04:00
437492e623
small changes
IshaAtteri
2026-03-12 12:16:51 -04:00
525e359c6b
- Html -> TSV
prabhaavp
2026-03-12 12:14:31 -04:00
a1beba6730
BeautifulSoup extract code
IshaAtteri
2026-03-12 12:11:37 -04:00
1614d85270
- Fixed Bug: Certain characters can't be used for folder names. Need to fix it so those characters are removed. There is now a sanitize_slug function used
prabhaavp
2026-03-10 14:45:45 -04:00
cfbddf2a24
- Updates to make it name the folder the name of the wikipedia slug. Fix needed: Certain characters can't be used for folder names. Need to fix it so those characters are removed.
prabhaavp
2026-03-10 14:15:33 -04:00
8fa2cdba3c
preprocessing script
IshaAtteri
2026-03-10 14:14:59 -04:00
2ec6f8c28a
testing extract_wiki_zim.py
Vadella, Anna
2026-03-10 13:29:56 -04:00
36af063777
- Delete the folders if we skipped a movie due to not being found
prabhaavp
2026-03-10 13:17:21 -04:00
0ac1234afa
- Fix directories
prabhaavp
2026-03-10 13:10:25 -04:00
401e7e5497
- Extract info needed from ZIM file
prabhaavp
2026-02-12 20:07:09 -05:00
9412c834f1
Merge pull request #2 from IshaAtteri/isha
IshaAtteri
2026-02-11 17:56:24 -05:00
cb2fcd19eb
structure change
IshaAtteri
2026-02-11 17:55:24 -05:00
ed2e20f8cd
Merge pull request #1 from IshaAtteri/isha has the code
IshaAtteri
2026-02-11 17:54:04 -05:00
0cc571727b
wikipedia movie scraping using api code
IshaAtteri
2026-02-11 17:51:38 -05:00
30dbfe0dcc
code for job system stuff
IshaAtteri
2026-02-11 17:40:59 -05:00
369f5ced89
Update README.md
prabhaavp
2026-02-03 22:25:28 -05:00
2d2ee64c0e
- Added venv instruction + requirements.txt - Added data folder structure with .gitkeep - Added .gitignore - Added load.py to load IMDB dataset and preview with D-Tale
prabhaavp
2026-02-03 22:21:41 -05:00
c18b412867
Initial commit
IshaAtteri
2026-01-27 12:39:22 -05:00