CMPS 6730 Sample Project

This repository contains starter code for the final project in CMPS 4730/6730: Natural Language Processing at Tulane University.

The code in this repository will be copied into your team's project repository at the start of the course to provide a starting point for your project.

You should edit this file to include a summary of the goals, methods, and conclusions of your project.

The structure of the code supports the following:

  • A simple web UI using Flask to support a demo of the project
  • A command-line interface to support running different stages of the project's pipeline
  • The ability to easily reproduce your work on another machine by using virtualenv and providing access to external data sources.
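To make the idea of a command-line interface with separate pipeline stages concrete, here is a minimal sketch. It assumes two hypothetical stages, `download` and `train`, and a hypothetical program name `nlp-demo`; none of these names come from the starter code, which may be organized differently and may use a different CLI library. `argparse` is used here only because it ships with Python.

```python
import argparse


def download(args):
    # Placeholder stage (hypothetical): a real project would fetch external data here.
    return f"downloading data to {args.out}"


def train(args):
    # Placeholder stage (hypothetical): a real project would fit a model here.
    return f"training for {args.epochs} epochs"


def build_parser():
    # "nlp-demo" is an illustrative program name, not the starter code's command.
    parser = argparse.ArgumentParser(prog="nlp-demo")
    sub = parser.add_subparsers(dest="stage", required=True)

    p_download = sub.add_parser("download", help="fetch external data")
    p_download.add_argument("--out", default="data")
    p_download.set_defaults(func=download)

    p_train = sub.add_parser("train", help="fit a model")
    p_train.add_argument("--epochs", type=int, default=3)
    p_train.set_defaults(func=train)
    return parser


def main(argv=None):
    # Parse the arguments and dispatch to the function registered for the chosen stage.
    args = build_parser().parse_args(argv)
    return args.func(args)


# Run one stage programmatically, as if the arguments had been typed in a shell.
print(main(["train", "--epochs", "5"]))
```

Keeping each stage behind its own subcommand lets teammates rerun a single step, such as training, without repeating the earlier ones.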

Using this repository

  • At the start of the course, students will be divided into project teams. Each team will receive a copy of this starter code in a new repository, for example: https://github.com/tulane-cmps6730/project-alpha
  • Each team member will then clone the team repository to their personal computer to work on the project, for example: git clone https://github.com/tulane-cmps6730/project-alpha
  • See GettingStarted.md for instructions on using the starter code.

Contents

  • docs: template to create slides for project presentations
  • nlp: Python project code
  • notebooks: Jupyter notebooks for project development and experimentation
  • report: LaTeX report
  • tests: unit tests for project code
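As an illustration of what a unit test for project code might look like, the sketch below tests a hypothetical `tokenize` helper with Python's built-in `unittest` module. Both the helper and the test names are assumptions made for this example; the tests in the starter code may be structured differently or use another test runner.

```python
import unittest


def tokenize(text):
    """Hypothetical helper: lowercase the text and split it on whitespace."""
    return text.lower().split()


class TestTokenize(unittest.TestCase):
    def test_lowercases_and_splits(self):
        self.assertEqual(tokenize("Hello NLP World"), ["hello", "nlp", "world"])

    def test_empty_string(self):
        self.assertEqual(tokenize(""), [])


def run_tests():
    # Load only the test case defined above and run it quietly.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestTokenize)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
```

Small tests like these are cheap to write and catch regressions when several team members edit the same module.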

Background Resources

The following will give you some technical background on the technologies used here:

  1. Refresh your Python by completing this online tutorial: https://www.learnpython.org/ (3 hours)
  2. Create a GitHub account at https://github.com/
  3. Set up git by following https://help.github.com/en/articles/set-up-git (30 minutes)
  4. Learn git by completing the Introduction to GitHub tutorial, reading the git handbook, then completing the Managing merge conflicts tutorial (1 hour).
  5. Install the Python data science stack from https://www.anaconda.com/distribution/. We will use Python 3. (30 minutes)
  6. Complete the scikit-learn tutorial from https://www.datacamp.com/community/tutorials/machine-learning-python (2 hours)
  7. Understand how Python packages work by going through the Python Packaging User Guide (you can skip the "Creating Documentation" section). (1 hour)
  8. Complete Part 1 of the Flask tutorial. Flask is the library we will use to build a web demo for your project.