Analysis and visualisation of NLP essay analyses

Project Proposer:

Simon Knight

Project Contact:

Simon Knight

Contact Number:



Problem Statement:

For a given text (e.g., a student essay or a Wikipedia article), we can compute sets of NLP indices at various levels (whole text, paragraph, sentence, n-gram, etc.). A core problem is how to present these indices to instructors and students in ways that are helpful to them. Equally fundamental is how to analyse this data in relation to learning (e.g., indices -> assessment criteria; indices -> clusters of student writing styles or study characteristics; indices -> particular positive or negative features of a text).
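As a rough illustration of what "indices at various levels" could look like in practice, the Python sketch below computes a few deliberately simple measures (word counts, type-token ratio, bigram counts) for the whole text, each paragraph, and each sentence. The function names, the regex-based sentence splitter, and the chosen measures are all assumptions made for this example; they are not the indices produced by AWA, XIP, or TAACO.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a crude stand-in for a real tokenizer."""
    return re.findall(r"[A-Za-z']+", text.lower())


def split_sentences(paragraph):
    """Naive split on ., ! or ? followed by whitespace (assumption: no abbreviations)."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]


def type_token_ratio(tokens):
    """Distinct words divided by total words; 0.0 for empty input."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def compute_indices(text, n=2):
    """Return illustrative indices at text, paragraph, sentence and n-gram level."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    tokens = tokenize(text)
    # n-grams are taken over the whole token stream, so they may span sentences.
    ngrams = Counter(zip(*(tokens[i:] for i in range(n))))
    return {
        "text": {"words": len(tokens), "ttr": type_token_ratio(tokens)},
        "paragraphs": [
            {"words": len(tokenize(p)), "sentences": len(split_sentences(p))}
            for p in paragraphs
        ],
        "sentences": [
            {"paragraph": i, "words": len(tokenize(s))}
            for i, p in enumerate(paragraphs)
            for s in split_sentences(p)
        ],
        "ngrams": ngrams,
    }


sample = "The cat sat. The cat ran!\n\nDogs bark."
indices = compute_indices(sample)
```

Keeping every level in one nested structure is only one possible design; it makes it straightforward for a visualisation layer to move between a whole-text summary and the sentences that contribute to it.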

Desired Outcome:

In the first instance we’ll probably provide (or work with you to obtain) dummy data (e.g., using Wikipedia articles as source texts).

We currently have a tool – AWA – that uses the Xerox Incremental Parser (XIP) to analyse a text for ‘rhetorical moves’ at the sentence level, which are then displayed back to the end user as highlighted sentences. This follows a common thread across tools of highlighting at the sentence level (see also, e.g., Hemingway and Grammarly). Supplementary information is often provided alongside the highlighting (e.g., specific target changes), which we anticipate adding in future versions of the tool. Other tools – including many with rather sophisticated indices – simply provide numeric outputs of those indices (see, e.g., TAACO). Others have built whole editing environments around their indices (e.g., glosser).

We are working on developing the AWA tool further, including empirical studies with academics to investigate how students adopt the feedback given to them through the tool, and exploration of other tools’ indices for inclusion in the AWA toolset. We are also exploring:


  1. The range of possible methods to visualise the data obtained:
    1. across levels of granularity (whole text, paragraph, sentence)
    2. to aid navigation between salient features of feedback (noting that ‘no highlighting’ is not an insignificant feature)
    3. to allow for comparison between two versions of the same text
    4. and so on
  2. Combinations of indices, particularly those obtained from XIP at the sentence level with indices obtained at other levels (e.g., n-grams from topics, and sentence- and paragraph-level cohesion features from TAACO).
  3. How indices might be combined and mined (including sequence mining) for novel analytics indicative of salient learning features.
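To make the sequence-mining idea in item 3 a little more concrete, the Python sketch below counts contiguous runs of sentence-level labels and keeps those that appear in at least a given number of essays. Everything here is assumed for illustration: the label names (including "none" for an unhighlighted sentence), the function `frequent_move_sequences`, and the toy data. Counting contiguous n-grams is only a simple stand-in for fuller sequential pattern mining methods such as PrefixSpan, and is not how AWA or XIP works.

```python
from collections import Counter


def frequent_move_sequences(essays, length=2, min_support=2):
    """Return contiguous label sequences found in at least min_support essays.

    essays: list of essays, each a list of sentence-level labels in text order.
    Support counts essays, not occurrences, so a sequence repeated within one
    essay still contributes 1.
    """
    support = Counter()
    for labels in essays:
        seen = {
            tuple(labels[i:i + length])
            for i in range(len(labels) - length + 1)
        }
        support.update(seen)
    return {seq: n for seq, n in support.items() if n >= min_support}


# Hypothetical label sequences for three essays; "none" marks a sentence
# that received no highlighting, which is treated as a label in its own right.
essays = [
    ["background", "none", "contrast", "summary"],
    ["background", "contrast", "summary"],
    ["none", "contrast", "summary", "none"],
]
result = frequent_move_sequences(essays)
```

On this toy data the pair ("contrast", "summary") is found in all three essays and ("none", "contrast") in two. Patterns like these could then be related to grades or assessment criteria, which is the kind of analysis the project would need to design and validate.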

Project Stage:

Course Context/s:

Other Information:

Depending on project scoping, an individual may be able to tackle parts of the project alone.

There may be opportunities to publish work arising from this project with academic collaborators within CIC and the faculties.

Team Requirements: