@soontobecolleger No worries. I understand how hectic college stuff is.
I think some people make generalizations about computational linguistics. A family friend who got his master’s in Hispanic Linguistics likes to say that computational linguistics is really just computer science and that the data being focused on just happens to be language-based. In fact, he said the data might as well be medical records, because he sees computational linguistics as having nothing to do with linguistics. To be fair, he hasn’t really explored this area of linguistics, so I’m writing off his view as a misconception.
I had the opportunity to take a graduate course in computational linguistics. In the course, we were each assigned a language, and mine was Polish. We used some simple Python scripts to explore the language we had been given. For example, we ran a straightforward frequency analysis and were able to determine that the majority of the most common words were of the functional category (prepositions, articles, etc.). We then ran other scripts to perform bigram analyses to help determine how similar one word was to another. This produced relatively disastrous results (from a linguistics perspective). A computer scientist would likely not have cared and would have deemed the results “good enough”, but it was quite depressing to see which words were shown to be highly similar.
On the other side of things, I got to take a course that basically had to do with speech recognition. It went over the history of speech recognition and how we’re currently at a point where statistics is mainly used to figure out what a user is trying to say, but of course, this isn’t good enough. We need to teach such systems how humans communicate (how the tongue functions and what sounds are produced), but this is obviously very difficult and time-consuming to teach a computer (and there’s the possibility that we don’t even know how to approach it), so the statistics-based method wins out when the results are decent enough.
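If it helps to picture what scripts like the frequency and bigram analyses look like, here's a rough sketch. To be clear, this isn't the code from the course: the function names and the toy English sentence are made up for illustration, and cosine similarity over neighboring-word counts is just one common way to score word similarity from bigrams.

```python
import re
from collections import Counter, defaultdict
from math import sqrt


def tokenize(text):
    # Lowercase, then keep runs of letters only. The pattern is
    # Unicode-aware, so Polish letters like "ż" or "ł" are kept.
    return re.findall(r"[^\W\d_]+", text.lower())


def word_frequencies(tokens):
    # Plain frequency analysis: how often each word occurs.
    return Counter(tokens)


def bigram_contexts(tokens):
    # For every word, count its immediate left and right neighbors.
    # "L"/"R" tags keep "comes before" and "comes after" separate.
    contexts = defaultdict(Counter)
    for left, right in zip(tokens, tokens[1:]):
        contexts[left][("R", right)] += 1
        contexts[right][("L", left)] += 1
    return contexts


def cosine(a, b):
    # Cosine similarity between two neighbor-count vectors.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Toy data (hypothetical); a real run would read in a large corpus.
sample = "The cat sat on the mat and the dog sat on the rug"
tokens = tokenize(sample)
freqs = word_frequencies(tokens)
contexts = bigram_contexts(tokens)

print(freqs.most_common(3))
print(cosine(contexts["cat"], contexts["dog"]))
print(cosine(contexts["cat"], contexts["mat"]))
```

In this toy sentence, "cat" and "dog" have exactly the same neighbors, so they score as near-identical, while "cat" and "mat" share only one. A measure like this only sees which words sit next to each other, which is probably part of why the similarity results can look so odd to a linguist.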
Overall, computational linguistics is basically looking at vast amounts of language data and trying to teach a computer something based on the patterns that can be gleaned from the data. I was also really interested in the speech recognition stuff, but my college only had the one course in that subject area. The programming stuff wasn’t that bad. It was mainly Python and Matlab, which are relatively easy languages for most students to pick up. Python especially has nice syntax that’s easy to remember.
Linguist List is pretty awesome! I’m not sure how I stumbled upon it myself, but I like it.