Key to Forensic Linguistics is the idea that everyone’s language contains an identifiable set of words and habits, and that those features are more or less unique to each of us.
For example, I consistently misspell certain words by reversing letters. My i’s and my e’s are always the wrong way round, so I have to spell-check before posting. But if you see something that I’ve posted ‘on the fly’, you may find that I’ve spelled ‘because’ as ‘becuase’ or ‘their’ as ‘thier’, among other things.
You might say that it’s simply a spelling mistake, and a very common one at that. But if you recognise it as a consistent element of someone’s written style, and they choose not to correct it with a spell-checker, you can sometimes identify them by that alone.
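
To make that a bit more concrete, here is a minimal Python sketch of the idea. It is only an illustration, not a real forensic method: the MARKERS list and the function names misspelling_profile and shared_markers are made up for this example, and the two misspellings are simply the ones mentioned above. It counts the marker misspellings in a text and reports which ones turn up in both a known sample and a questioned one.

```python
import re
from collections import Counter

# Hypothetical marker list (an assumption for this sketch):
# habitual misspelling -> the word that was intended.
MARKERS = {"becuase": "because", "thier": "their"}


def misspelling_profile(text):
    """Count how often each marker misspelling appears in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w in MARKERS)


def shared_markers(known_text, questioned_text):
    """Return the marker misspellings found in both texts, sorted."""
    known = misspelling_profile(known_text)
    questioned = misspelling_profile(questioned_text)
    return sorted(set(known) & set(questioned))


if __name__ == "__main__":
    known = "I left becuase thier car was late."
    questioned = "We stayed in becuase it rained."
    print(shared_markers(known, questioned))
```

A shared misspelling on its own proves very little, because common mistakes are common. It only becomes interesting alongside other features of someone’s style.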

Other markers include substituted words: mixing up words with similar meanings, or using a word that means completely the opposite.  That’s the basic idea, anyway ;)

Corpus = the internal dictionary we all use?

In some ways, you could think of a corpus as your internal dictionary.  Each of us should have a unique one, or at least one with identifiably unique features.

The more widely accepted definition of a corpus is a broader one: a body of texts that serves as a sample of the language it is meant to represent.  But I believe each writer has their own body of work and therefore, in some ways, their own comparable ‘corpus’.
My first paper on the concept is coming soon, but hopefully this basic definition will help ;)
