-
Textual analysis software for Linux?
I'm new to Linux and have a laptop basically up and running, but an old project has resurfaced, and I'm looking to learn some software to help with it. I know there's a call for this sort of thing, so I hope someone here can point me to something that runs under Linux, is learnable for a beginner, and does the following:
-search a large body of text for certain strings (or any one of several strings) and output each match in its context (the containing sentence, marked by a space plus a capital letter on one side and a sentence-ending mark on the other)
-sort these results, separating out the ones that contain more than one of the desired strings
-mark up and store all of this in a readable way
I think the answer is that simple scripting will do some of what I want, and I'm looking into that, but the texts are not in Latin characters, so I'm not entirely sure how to handle that.
Would appreciate any advice anyone can offer.
-
Solely to let you know that people do read your postings: the non-Latin characters knock me out of the "helper's box." Sorry.
-
Play around with grep; it may do what you need. Otherwise I would hit the Perl books.
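grep works a line at a time, though, and the sentence-level matching you describe is also only a few lines of Python 3, whose strings are Unicode throughout, so non-Latin scripts need no special handling. A minimal sketch (the crude sentence-splitting rule, the Russian sample text, and the target strings are all placeholders for your own data):

```python
# Minimal sketch: pull out every sentence that contains any of several
# target strings, grouping hits by how many distinct targets they contain.
# Assumes UTF-8 input; Python 3 str handles non-Latin scripts natively.
import re

def sentences_with(text, targets):
    # Crude sentence split: a sentence-ending mark followed by whitespace.
    # (A split on "capital letter after a space" would need per-script
    # tailoring, so this approximation is used instead.)
    sentences = re.split(r"(?<=[.!?])\s+", text)
    hits = {}
    for sentence in sentences:
        # casefold() makes the match case-insensitive across scripts.
        found = [t for t in targets if t.casefold() in sentence.casefold()]
        if found:
            hits.setdefault(len(found), []).append((sentence, found))
    return hits

sample = "Кошка спит. Собака и кошка играют! Птица поёт."
results = sentences_with(sample, ["кошка", "собака"])
# results maps 1 -> single-target sentences, 2 -> sentences with both.
```

Reading a whole file is then just `sentences_with(open("file.txt", encoding="utf-8").read(), targets)`.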
-
Natural Language Toolkit (NLTK) is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for the Python programming language. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning. NLTK is available for Windows, Mac OS X, and Linux.
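For the task described above, a first NLTK session might look like this. It is only a sketch: it assumes NLTK is installed (`pip install nltk`), and the Russian sample text and target word are made-up placeholders. NLTK's concordance view prints each occurrence of a word in its surrounding context, which is close to what was asked for.

```python
# Tokenize some non-Latin text and show a target word in context.
from nltk.tokenize import RegexpTokenizer
from nltk.text import Text

# \w+ matches non-Latin word characters too, since Python 3 regexes
# are Unicode-aware by default; no trained tokenizer data is needed.
tokenizer = RegexpTokenizer(r"\w+")

corpus = "Это первое предложение. Это второе предложение."
tokens = tokenizer.tokenize(corpus)
text = Text(tokens)

# Prints every occurrence of the word with its surrounding context.
text.concordance("предложение")
```

NLTK also ships sentence tokenizers (e.g. `nltk.sent_tokenize`), but those need the extra "punkt" data download and support a fixed list of languages, so check yours is covered.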
Last edited by bestellen; 09-22-2015 at 01:30 PM.