Free personas file from Grid-Tools

How do Grid-Tools create and manage personas?

The free personas file contains 30,000 rows of data built up using a combination of coverage techniques and statistical ranges. It also includes a set of 330,000 links that simulate a network of social connections between people. The population has been skewed slightly by age to include more 15 to 30-year-olds, who tend to be more active on social networks, as shown in the graph below.

[Figure: personas graph]

The columns included in the example file are:

  • Person id
  • Name, first name, last name, middle names, title
  • Street address, city, county, post code, home phone, work phone, mobile phone
  • NI number
  • Date of birth, age
  • Sex
  • Ethnic origin
  • Marital status
  • Income, credit score, home owner, employment status
  • Number of emails, number of IMs
  • Email1, Email2, Email3, Email4
  • IM1, IM2, IM3, IM4
  • Number of social links

The data was created in two stages: first using a coverage map, and then using Datamaker’s inbuilt functions to further vary the data.

The coverage map included five variables: ethnic origin, sex, marital status, age and region.

The age was skewed slightly to include more data in the 15 to 30 range. Coverage techniques use orthogonal arrays to build a smaller set of data in which every pair of values still appears at least once. Instead of trillions of combinations, a smaller set of data provides nearly as much coverage without requiring high volumes.
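To illustrate the idea of pairwise coverage, here is a minimal Python sketch. It is not Datamaker's implementation and it does not use a true orthogonal array: it swaps in a simple greedy all-pairs search, which gives the same guarantee that every pair of values appears in at least one row. The value lists for each variable are hypothetical placeholders, so the row count will not match the 359 combinations quoted here.

```python
from itertools import combinations, product

# Hypothetical value lists for the five coverage-map variables.
variables = {
    "ethnic_origin": ["A", "B", "C", "D"],
    "sex": ["F", "M"],
    "marital_status": ["single", "married", "divorced", "widowed"],
    "age_band": ["15-30", "31-45", "46-60", "61-75", "76+"],
    "region": ["North", "South", "East", "West", "Midlands", "London"],
}


def uncovered_pairs(variables):
    """Every (var_a, value_a, var_b, value_b) pair that must appear at least once."""
    pairs = set()
    for a, b in combinations(list(variables), 2):
        for va, vb in product(variables[a], variables[b]):
            pairs.add((a, va, b, vb))
    return pairs


def pairs_in_row(row, names):
    """The value pairs that a single row covers."""
    return {(a, row[a], b, row[b]) for a, b in combinations(names, 2)}


def pairwise_rows(variables):
    """Greedy all-pairs: keep picking the row that covers the most uncovered pairs."""
    names = list(variables)
    candidates = [dict(zip(names, combo)) for combo in product(*variables.values())]
    remaining = uncovered_pairs(variables)
    rows = []
    while remaining:
        best = max(candidates, key=lambda r: len(pairs_in_row(r, names) & remaining))
        rows.append(best)
        remaining -= pairs_in_row(best, names)
    return rows


rows = pairwise_rows(variables)
print(len(rows), "rows cover every pair; the full product has 960 rows")
```

With these placeholder values the exhaustive product has 4 x 2 x 4 x 5 x 6 = 960 rows, while the pairwise set needs only a fraction of that. The greedy search is easy to follow but slow on large inputs; dedicated covering-array tools scale better.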

The 359 initial combinations were increased to 30,000 using the following additional functions:

NAME                 Made up of ^TITLE^ ^FIRST_NAME^ ^LAST_NAME^
FIRST_NAME           A random list of first names varied by sex and ethnic origin
LAST_NAME            A random list of last names varied by ethnic origin
MIDDLE_NAMES         A random list of middle names varied by sex and ethnic origin; not all people have middle names
TITLE                Adjusted by sex
STREET_ADDRESS       A random list of street names
CITY                 A random list of cities based on region
COUNTY               The region - picked up from the coverage map
POST_CODE            A random list of post codes based on region
HOME_PHONE           A random phone number; the STD code is based on region - also adjusted for age
WORK_PHONE           A random phone number; the STD code is based on region - also adjusted for age
MOBILE_PHONE         A random phone number; the dialling code is picked from standard mobile numbers - also adjusted for age
NI_NUMBER            A random NI number
DOB                  Picked up from the coverage map but adjusted plus or minus 4 years
AGE                  Calculated from the DOB
SEX                  Picked up from the coverage map
ETHNIC_ORIGIN        Picked up from the coverage map
MARRITAL_STATUS      Picked up from the coverage map - also adjusted for age
INCOME               Adjusted for age
CREDIT_SCORE         Adjusted for age
HOME_OWNER           Adjusted for age, but based on: Homeowner; Renting; Mortgage; Council House; No Fixed Abode
EMPLOYMENT_STATUS    Adjusted for age
NO_OF_EMAILS         Adjusted for age
NO_OF_IMS            Adjusted for age
EMAIL1,2,3,4         Based on the name and on NO_OF_EMAILS
IM1,2,3,4            Based on NO_OF_IMS
ACTUAL_SOCIAL_LINKS  A cross-reference list matching the number of contacts via IM and email
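Three of the rules above (NAME, DOB and AGE) can be sketched in a few lines of Python. This is only an illustration of the logic, not Datamaker's own functions. The helper names `expand_template`, `jitter_dob` and `age_on`, the sample person, and the choice to shift the DOB by whole years are all assumptions made for the example.

```python
import random
from datetime import date


def expand_template(template, row):
    """Replace each ^COLUMN^ token with the row's value for that column."""
    out = template
    for column, value in row.items():
        out = out.replace(f"^{column}^", str(value))
    return out


def jitter_dob(dob, rng, max_years=4):
    """Shift a date of birth by a whole number of years in [-max_years, +max_years]."""
    year = dob.year + rng.randint(-max_years, max_years)
    try:
        return dob.replace(year=year)
    except ValueError:
        # 29 February does not exist in the target year; fall back to the 28th.
        return dob.replace(year=year, day=28)


def age_on(dob, today):
    """Age in completed years on the given day."""
    had_birthday = (today.month, today.day) >= (dob.month, dob.day)
    return today.year - dob.year - (0 if had_birthday else 1)


rng = random.Random(42)  # seeded so repeated runs give the same data
row = {"TITLE": "Ms", "FIRST_NAME": "Asha", "LAST_NAME": "Patel"}  # made-up person
row["NAME"] = expand_template("^TITLE^ ^FIRST_NAME^ ^LAST_NAME^", row)
row["DOB"] = jitter_dob(date(1990, 6, 15), rng)
row["AGE"] = age_on(row["DOB"], date(2024, 1, 1))
print(row)
```

Deriving AGE from the adjusted DOB, rather than generating it separately, keeps the two columns consistent with each other in every row.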

 

The social links allow you to traverse out from a single person into their social network.  For example:
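One way to walk the network is a breadth-first search over the link list. The sketch below assumes each link is a pair of person ids and that links work in both directions; the actual layout of the links file is not described here, so treat the input format, the function names and the sample ids as hypothetical.

```python
from collections import defaultdict, deque


def build_adjacency(links):
    """Turn (person_id, person_id) pairs into an undirected adjacency map."""
    adjacency = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def network_within(adjacency, start, max_hops):
    """Breadth-first search: map each reachable person id to its hop count from start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_hops:
            continue  # do not expand beyond the requested radius
        for friend in adjacency.get(person, ()):
            if friend not in distance:
                distance[friend] = distance[person] + 1
                queue.append(friend)
    return distance


# Tiny made-up link list: 1-2, 2-3, 3-4 and 1-5.
links = [(1, 2), (2, 3), (3, 4), (1, 5)]
adjacency = build_adjacency(links)
reachable = network_within(adjacency, start=1, max_hops=2)
print(reachable)  # person 4 is three hops away, so it is not included
```

Limiting the search with `max_hops` matters on the full file: with 330,000 links across 30,000 people, an unbounded walk is likely to reach most of the population within a few hops.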
