Commit Graph

  • 47dd51af4a Clean up folder structure, remove unused data, add images for frontend use (maybe - keeping options open [main] Prabhaav Pillai 2026-04-04 01:10:57 -04:00
  • 6eaaa4c4e6 readme change to test gitea Prabhaav Pillai 2026-04-03 20:55:19 -04:00
  • d4777b5e72 Updated page.tsx Vadella, Anna 2026-04-01 13:02:29 -04:00
  • 90a551c048 Merge branch 'main' of https://github.com/IshaAtteri/datamining_881 changes to embeddings for plot IshaAtteri 2026-03-26 14:05:55 -04:00
  • ee358acf64 text embeddings for plot IshaAtteri 2026-03-26 14:05:02 -04:00
  • 3a912bf09e Frontend changes Vadella, Anna 2026-03-26 12:35:58 -04:00
  • 24e0d2cc21 Ground Truth Spreadsheets and Code used to move all pictures into a folder prabhaavp 2026-03-26 01:18:43 -04:00
  • 496761ca78 director and cast preprocessing IshaAtteri 2026-03-25 18:21:16 -04:00
  • 233fa3df17 preprocessing changes IshaAtteri 2026-03-25 18:14:03 -04:00
  • db20497deb Merge branch 'main' of https://github.com/IshaAtteri/datamining_881 prabhaavp 2026-03-19 13:03:28 -04:00
  • a06d3c8963 Spreadsheets folder prabhaavp 2026-03-19 13:03:13 -04:00
  • 41eaba161b structural changes IshaAtteri 2026-03-19 12:49:10 -04:00
  • c5d1ff3ab4 some changes IshaAtteri 2026-03-19 12:45:32 -04:00
  • db645f3bbe changes IshaAtteri 2026-03-19 12:32:56 -04:00
  • 492160c3a3 Revisions to Zim parsing, netflix parsing, and updates to html scraping to include synopsis prabhaavp 2026-03-19 01:56:14 -04:00
  • 0a70920ba9 Merge branch 'main' of https://github.com/IshaAtteri/datamining_881 prabhaavp 2026-03-19 01:55:21 -04:00
  • a8b7a6d20c Revisions to Zim parsing, netflix parsing, and updates to html scraping to include synopsis prabhaavp 2026-03-19 01:54:55 -04:00
  • 300e5dec16 Switch to using Webpack instead of Turbopack Vadella, Anna 2026-03-18 16:51:56 -04:00
  • 08e5739638 Create base for frontend interface Vadella, Anna 2026-03-18 16:21:51 -04:00
  • 279fe399ed Minor Cleanup of files. Moved to unused folder. prabhaavp 2026-03-17 01:24:09 -04:00
  • 52f8d8faef the spreadsheet prabhaavp 2026-03-12 14:21:13 -04:00
  • 2638de1191 The code to extract zim into a spreadsheet. prabhaavp 2026-03-12 14:19:40 -04:00
  • a435592f75 Merge branch 'main' of https://github.com/IshaAtteri/datamining_881 into isha [isha] IshaAtteri 2026-03-12 12:41:15 -04:00
  • 437492e623 small changes IshaAtteri 2026-03-12 12:16:51 -04:00
  • 525e359c6b - Html -> TSV prabhaavp 2026-03-12 12:14:31 -04:00
  • a1beba6730 beatifulsoup extract code IshaAtteri 2026-03-12 12:11:37 -04:00
  • 1614d85270 - Fixed Bug: Certain characters can't be used for folder names. Need to fix it so those characters are removed. There is now a sanitize_slug function used prabhaavp 2026-03-10 14:45:45 -04:00
  • cfbddf2a24 - Updates to make it name the folder the name of the wikipedia slug. Fix needed: Certain characters can't be used for folder names. Need to fix it so those characters are removed. prabhaavp 2026-03-10 14:15:33 -04:00
  • 8fa2cdba3c preprocessing script IshaAtteri 2026-03-10 14:14:59 -04:00
  • 2ec6f8c28a testing extract_wiki_zim.py Vadella, Anna 2026-03-10 13:29:56 -04:00
  • 36af063777 - Delete the folders if we skipped a movie due to not being found prabhaavp 2026-03-10 13:17:21 -04:00
  • 0ac1234afa - Fix directories prabhaavp 2026-03-10 13:10:25 -04:00
  • 401e7e5497 - Extract info needed from ZIM file prabhaavp 2026-02-12 20:07:09 -05:00
  • 9412c834f1 Merge pull request #2 from IshaAtteri/isha IshaAtteri 2026-02-11 17:56:24 -05:00
  • cb2fcd19eb structure change IshaAtteri 2026-02-11 17:55:24 -05:00
  • ed2e20f8cd Merge pull request #1 from IshaAtteri/isha has the code IshaAtteri 2026-02-11 17:54:04 -05:00
  • 0cc571727b wikipedia movie scraping using api code IshaAtteri 2026-02-11 17:51:38 -05:00
  • 30dbfe0dcc code for job system stuff IshaAtteri 2026-02-11 17:40:59 -05:00
  • 369f5ced89 Update README.md prabhaavp 2026-02-03 22:25:28 -05:00
  • 2d2ee64c0e - Added venv instruction + requirements.txt - Added data folder structure with .gitkeep - Added .gitignore - Added load.py to load IMDB dataset and preview with D-Tale prabhaavp 2026-02-03 22:21:41 -05:00
  • c18b412867 Initial commit IshaAtteri 2026-01-27 12:39:22 -05:00