datamining_881

Author	SHA1	Message	Date
prabhaavp	db20497deb	Merge branch 'main' of https://github.com/IshaAtteri/datamining_881	2026-03-19 13:03:28 -04:00
prabhaavp	a06d3c8963	Spreadsheets folder	2026-03-19 13:03:13 -04:00
IshaAtteri	41eaba161b	structural changes	2026-03-19 12:49:10 -04:00
IshaAtteri	c5d1ff3ab4	some changes	2026-03-19 12:45:32 -04:00
IshaAtteri	db645f3bbe	changes	2026-03-19 12:32:56 -04:00
prabhaavp	492160c3a3	Revisions to Zim parsing, netflix parsing, and updates to html scraping to include synopsis	2026-03-19 01:56:14 -04:00
prabhaavp	0a70920ba9	Merge branch 'main' of https://github.com/IshaAtteri/datamining_881	2026-03-19 01:55:21 -04:00
prabhaavp	a8b7a6d20c	Revisions to Zim parsing, netflix parsing, and updates to html scraping to include synopsis	2026-03-19 01:54:55 -04:00
Vadella, Anna	300e5dec16	Switch to using Webpack instead of Turbopack. Using Next.js 16.2.0 with Tailwind v4 and Turbopack currently has a known issue: Turbopack tries to resolve Tailwind from the parent directory instead of the project folder.	2026-03-18 16:51:56 -04:00
Vadella, Anna	08e5739638	Create base for frontend interface	2026-03-18 16:21:51 -04:00
prabhaavp	279fe399ed	Minor Cleanup of files. Moved to unused folder.	2026-03-17 01:24:09 -04:00
prabhaavp	52f8d8faef	the spreadsheet	2026-03-12 14:21:13 -04:00
prabhaavp	2638de1191	The code to extract zim into a spreadsheet.	2026-03-12 14:19:40 -04:00
IshaAtteri	a435592f75	Merge branch 'main' of https://github.com/IshaAtteri/datamining_881 into isha	2026-03-12 12:41:15 -04:00
IshaAtteri	437492e623	small changes	2026-03-12 12:16:51 -04:00
prabhaavp	525e359c6b	- Html -> TSV	2026-03-12 12:14:31 -04:00
IshaAtteri	a1beba6730	BeautifulSoup extract code	2026-03-12 12:11:37 -04:00
prabhaavp	1614d85270	- Fixed Bug: Certain characters can't be used for folder names. Need to fix it so those characters are removed. There is now a sanitize_slug function used	2026-03-10 14:45:45 -04:00
prabhaavp	cfbddf2a24	- Updates to name each folder after the Wikipedia slug. Fix needed: Certain characters can't be used for folder names. Need to fix it so those characters are removed.	2026-03-10 14:15:33 -04:00
IshaAtteri	8fa2cdba3c	preprocessing script	2026-03-10 14:14:59 -04:00
Vadella, Anna	2ec6f8c28a	testing extract_wiki_zim.py	2026-03-10 13:29:56 -04:00
prabhaavp	36af063777	- Delete the folders if we skipped a movie due to not being found	2026-03-10 13:17:21 -04:00
prabhaavp	0ac1234afa	- Fix directories	2026-03-10 13:10:25 -04:00
prabhaavp	401e7e5497	- Extract info needed from ZIM file	2026-02-12 20:07:09 -05:00
IshaAtteri	9412c834f1	Merge pull request #2 from IshaAtteri/isha: structure change	2026-02-11 17:56:24 -05:00
IshaAtteri	cb2fcd19eb	structure change	2026-02-11 17:55:24 -05:00
IshaAtteri	ed2e20f8cd	Merge pull request #1 from IshaAtteri/isha: has the code Isha	2026-02-11 17:54:04 -05:00
IshaAtteri	0cc571727b	wikipedia movie scraping using api code	2026-02-11 17:51:38 -05:00
IshaAtteri	30dbfe0dcc	code for job system stuff	2026-02-11 17:40:59 -05:00
prabhaavp	369f5ced89	Update README.md: Updated readme to include structure picture	2026-02-03 22:25:28 -05:00
prabhaavp	2d2ee64c0e	Added venv instruction + requirements.txt; added data folder structure with .gitkeep; added .gitignore; added load.py to load IMDB dataset and preview with D-Tale	2026-02-03 22:21:41 -05:00
IshaAtteri	c18b412867	Initial commit	2026-01-27 12:39:22 -05:00

32 Commits