Objective: Web scraping with Python
Grading Procedures: All submissions will be checked with plagiarism-detection software. Submissions with more than 70% similarity to any other student submission and/or internet resources will share the total points of the assignment. For example, 4 submissions with more than 70% similarity to one another will each be graded as 100/4 = 25 pts, assuming that the program is worth 100 pts.
Description: The university maintains course schedules at http://appsprod.tamuc.edu/Schedule/Schedule.aspx for different semesters (spring, fall, winter, etc.). You will develop a Python program that dynamically completes certain tasks, such as list, find, sort, and save, on course listings from the schedule portal. You will mainly use the “requests” and “BeautifulSoup” libraries (or similar, see exercise 12.1). The program will operate at different levels: Semester and Department. Your program will be a menu-based application. Assume that your project file is myproject.py. Once you run it, it will show the last 5 semesters (fall, spring, and summer only; not winter or May Mini).
> python myproject.py
Choose a semester: 1) Spring 2021 2) Fall 2020 3) Summer II 4) Summer I 5) Spring 2020
Selection: 2
Here, your program will parse the data from the website and show only the last (most recent) 5 semesters. The user will make a selection; then you will show the departments for the selected semester (Fall 2020). Note that the selected semester is shown before a “>” sign.
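The semester filter described above can be sketched as follows. This is only an illustration: the names `recent_semesters`, `format_menu`, and `scraped` are hypothetical, the sample labels are made up, and the sketch assumes the labels have already been scraped from the page in most-recent-first order.

```python
# Terms that may appear in the menu (winter and May Mini are excluded).
ALLOWED_TERMS = ("fall", "spring", "summer")


def recent_semesters(labels, count=5):
    """Return the first `count` labels whose first word is an allowed term.

    Assumes `labels` is already ordered from most recent to oldest.
    """
    kept = []
    for label in labels:
        words = label.split()
        if words and words[0].lower() in ALLOWED_TERMS:
            kept.append(label)
    return kept[:count]


def format_menu(options):
    """Render options on one line as '1) A 2) B ...'."""
    return " ".join(f"{i}) {opt}" for i, opt in enumerate(options, start=1))


# Hypothetical labels standing in for what the scraper would return.
scraped = [
    "Spring 2021", "Winter Mini 2020", "Fall 2020", "Summer II 2020",
    "Summer I 2020", "May Mini 2020", "Spring 2020", "Fall 2019",
]

if __name__ == "__main__":
    print("Choose a semester:", format_menu(recent_semesters(scraped)))
```

Keeping the filter separate from the scraping code makes it easy to check with a plain list before the website is involved.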
Fall 2020> Select a department:
1) Undeclared
2) Accounting and Finance
3) Art
4) Ag Science & Natural Resources
…
…
30) Social Work
31) Theatre
Q) Go back
Selection: 3
Fall 2020> Art > Select an option:
1) List courses by instructor name
2) List courses by capacity
3) List courses by enrollment size
4) List courses by course prefix
5) Save courses in a csv file
6) Search courses by instructor name
7) Search courses by course prefix
Q) Go back
Selection: ??
Here, your program will parse the data from the website and show all available departments, then the list of tasks. The Q (go back) option will take the user to the previous level.
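One possible way to structure the menu levels and the go-back behavior is sketched below. The function names `choose` and `navigate` are hypothetical, and the `read`/`write` parameters are an assumption added so that the loop can be exercised without a keyboard; the real program would pass the departments and tasks it parsed from the website.

```python
def choose(title, options, read=input, write=print):
    """Show a numbered menu; return the chosen option, or None for Q (go back)."""
    while True:
        write(title)
        for number, option in enumerate(options, start=1):
            write(f"{number}) {option}")
        write("Q) Go back")
        answer = read("Selection: ").strip()
        if answer.lower() == "q":
            return None
        if answer.isdecimal() and 1 <= int(answer) <= len(options):
            return options[int(answer) - 1]
        write("Invalid selection, try again.")


def navigate(semester, departments, tasks, read=input, write=print):
    """Department menu, then task menu; Q steps back exactly one level.

    Returns (department, task), or None if the user backs out of the
    department menu (the caller would then show the semester menu again).
    """
    while True:
        department = choose(f"{semester}> Select a department:",
                            departments, read, write)
        if department is None:
            return None
        task = choose(f"{semester}> {department} > Select an option:",
                      tasks, read, write)
        if task is not None:
            return department, task
        # Q in the task menu: loop around and show the departments again.


if __name__ == "__main__":
    picked = navigate("Fall 2020", ["Undeclared", "Art"],
                      ["List courses by capacity", "Save courses in a csv file"])
    print(picked)
```

Each menu level is a loop, so "go back" is simply a return to the enclosing loop rather than a special jump.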
Course listing output should show the following fields. For instance, the course listing for “Fall 2020> Computer Science & Info Sys> List courses by course prefix” should show
Prefix ID    Sec   Name                      Instructor           Hours Seats Enroll.
COSC   1301  01W   Intro to Compu            Lee, Kwang           3     35    10
COSC   1436  01E   Intro to Comp Sci & Prog  Brown, Thomas        4     40    36
COSC   1436  01L   Intro to Comp Sci & Prog  Brown, Thomas              40    36
COSC   1436  01W   Intro to Comp Sci & Prog  Hu, Kaoning          4     45    43
COSC   1436  02E   Intro to Comp Sci & Prog  Hu, Kaoning          4     35    32
as the first 5 rows.
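A listing like the one above can be produced by padding every field to a fixed width. The sketch below uses the column widths stated in this handout (6, 5, 5, 25, 20, 5, 5, 7); the single space between columns and the names `COLUMNS`, `format_row`, and `format_header` are assumptions made for illustration.

```python
# (header, width) pairs in the required column order.
COLUMNS = [
    ("Prefix", 6), ("ID", 5), ("Sec", 5), ("Name", 25),
    ("Instructor", 20), ("Hours", 5), ("Seats", 5), ("Enroll.", 7),
]


def format_row(values):
    """Left-justify each value in its column; cut values that are too long."""
    cells = []
    for value, (_, width) in zip(values, COLUMNS):
        text = str(value)[:width]
        cells.append(text.ljust(width))
    # Join with one space (an assumption) and drop trailing padding.
    return " ".join(cells).rstrip()


def format_header():
    """Format the header line with the same widths as the data rows."""
    return format_row([name for name, _ in COLUMNS])


if __name__ == "__main__":
    print(format_header())
    print(format_row(["COSC", "1301", "01W", "Intro to Compu",
                      "Lee, Kwang", 3, 35, 10]))
    # A lab section with no credit hours leaves the Hours cell empty.
    print(format_row(["COSC", "1436", "01L", "Intro to Comp Sci & Prog",
                      "Brown, Thomas", "", 40, 36]))
```

Because the header goes through the same function as the data, the columns stay aligned even if a width is changed later.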
You will follow the above headers and order (Prefix (col. width 6), ID (5), Sec (5), Name (25), Inst (20), Hours (5), Seats (5), Enroll. (7)) for the other listing selections too. Data cells should be aligned with the column headers and left-justified. No word in a course name should be longer than 5 characters; for instance, Algorithms should be abbreviated as “Algor”. The length of a course name will not exceed 25 characters. In option 5, the above format should be used to save a listing to a file in .csv format. The user will be able to provide a filename for the csv file.
For this program, you need to develop at least one class (chapter 10) with (possibly) many methods.
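The name abbreviation rule and the csv export could look roughly like this. Cutting each word to its first 5 characters matches the “Algor” example, but treating that as the rule for every word is an assumption, as are the names `abbreviate`, `write_csv`, `save_csv`, and `HEADERS`.

```python
import csv
import io

HEADERS = ["Prefix", "ID", "Sec", "Name", "Instructor",
           "Hours", "Seats", "Enroll."]


def abbreviate(name, max_word=5, max_len=25):
    """Cut every word to `max_word` chars, then the whole name to `max_len`."""
    words = [word[:max_word] for word in name.split()]
    return " ".join(words)[:max_len].rstrip()


def write_csv(rows, stream):
    """Write the header and the rows to an already open text stream."""
    writer = csv.writer(stream)
    writer.writerow(HEADERS)
    writer.writerows(rows)


def save_csv(rows, filename):
    """Save rows under a user-supplied filename."""
    # newline="" lets the csv module control line endings itself.
    with open(filename, "w", newline="", encoding="utf-8") as handle:
        write_csv(rows, handle)


if __name__ == "__main__":
    print(abbreviate("Design and Analysis of Algorithms"))
    buffer = io.StringIO()
    write_csv([["COSC", "1301", "01W", "Intro to Compu",
                "Lee, Kwang", 3, 35, 10]], buffer)
    print(buffer.getvalue())
```

The `csv` module quotes fields that contain commas, so an instructor such as "Lee, Kwang" still lands in a single column when the file is opened in a spreadsheet.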
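As one possible starting point for the class requirement, the sketch below groups the sort and search tasks into a single class. The names `Course`, `CourseListing`, and their methods are hypothetical; the sort direction and the case-insensitive matching are assumptions, since the handout does not specify them. The sample data is taken from the rows shown earlier.

```python
from dataclasses import dataclass


@dataclass
class Course:
    prefix: str
    course_id: str
    section: str
    name: str
    instructor: str
    hours: str      # kept as text because lab sections may have no hours
    seats: int
    enrolled: int


class CourseListing:
    """Holds the courses of one department and answers the menu tasks."""

    def __init__(self, courses):
        self.courses = list(courses)

    def sorted_by(self, field, descending=False):
        """Return a new list sorted by a Course attribute name."""
        return sorted(self.courses, key=lambda c: getattr(c, field),
                      reverse=descending)

    def search_instructor(self, text):
        """Case-insensitive substring match on the instructor name."""
        needle = text.lower()
        return [c for c in self.courses if needle in c.instructor.lower()]

    def search_prefix(self, prefix):
        """Case-insensitive exact match on the course prefix."""
        wanted = prefix.upper()
        return [c for c in self.courses if c.prefix.upper() == wanted]


SAMPLE = [
    Course("COSC", "1301", "01W", "Intro to Compu", "Lee, Kwang", "3", 35, 10),
    Course("COSC", "1436", "01E", "Intro to Comp Sci & Prog", "Brown, Thomas", "4", 40, 36),
    Course("COSC", "1436", "01L", "Intro to Comp Sci & Prog", "Brown, Thomas", "", 40, 36),
    Course("COSC", "1436", "01W", "Intro to Comp Sci & Prog", "Hu, Kaoning", "4", 45, 43),
    Course("COSC", "1436", "02E", "Intro to Comp Sci & Prog", "Hu, Kaoning", "4", 35, 32),
]

if __name__ == "__main__":
    listing = CourseListing(SAMPLE)
    for course in listing.sorted_by("enrolled", descending=True):
        print(course.section, course.instructor, course.enrolled)
```

A single `sorted_by` method covers the instructor, capacity, enrollment, and prefix listings, so each menu option only needs to pass a different attribute name.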