<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title></title>
	<atom:link href="http://www.grid-tools.com/blog/feed" rel="self" type="application/rss+xml" />
	<link>http://www.grid-tools.com/blog</link>
	<description></description>
	<lastBuildDate>Tue, 20 Apr 2010 08:44:04 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Magical mystery tour</title>
		<link>http://www.grid-tools.com/blog/uncategorized/magical-mystery-tour</link>
		<comments>http://www.grid-tools.com/blog/uncategorized/magical-mystery-tour#comments</comments>
		<pubDate>Tue, 20 Apr 2010 08:44:04 +0000</pubDate>
		<dc:creator>Huwprice</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Data masking]]></category>
		<category><![CDATA[test data]]></category>
		<category><![CDATA[Test Data Management]]></category>

		<guid isPermaLink="false">http://www.grid-tools.com/blog/uncategorized/magical-mystery-tour</guid>
		<description><![CDATA[Roll up, roll up. I’m going on a mystery tour and I’m inviting you along. I’m going to get in touch with the financial organisations that are part of my life and I’m going to ask them about what journey my personal details can expect to take once they become test data. 
It should prove [...]]]></description>
			<content:encoded><![CDATA[<p>Roll up, roll up. I’m going on a mystery tour and I’m inviting you along. I’m going to get in touch with the financial organisations that are part of my life and I’m going to ask them about what journey my personal details can expect to take once they become test data. </p>
<p>It should prove quite interesting as I’m not convinced that taking a copy of production data and using minimal masking techniques is quite as robust as their millions of customers imagine it to be. </p>
<p>In fact, if the average punter knew that copies of production data are used in development &#8211; often offshore &#8211; there would be an audible gasp of amazement. What? You have reassured us all with assurances about state-of-the-art security systems for live data only to let copies walk out the back door. And the reason why? Because they are only being used for development? </p>
<p>Well, call me pernickety, but the simple issue of using a copy of live data is enough to make me sit up at my keyboard and write a letter. I’ll keep you posted on what I find out. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.grid-tools.com/blog/uncategorized/magical-mystery-tour/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Green Giant</title>
		<link>http://www.grid-tools.com/blog/uncategorized/green-giant</link>
		<comments>http://www.grid-tools.com/blog/uncategorized/green-giant#comments</comments>
		<pubDate>Wed, 31 Mar 2010 15:01:23 +0000</pubDate>
		<dc:creator>VanessaHoward</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Test Data Creation]]></category>
		<category><![CDATA[test data generation]]></category>
		<category><![CDATA[Test Data Management]]></category>

		<guid isPermaLink="false">http://www.grid-tools.com/blog/uncategorized/green-giant</guid>
		<description><![CDATA[It is one thing to turn down the thermostat a notch or two at home and think about buying that hybrid car but how is your carbon footprint shaping up in the work environment? 
Not good by most measures but the real shock comes with the realisation that something as simple (cough) as data testing [...]]]></description>
			<content:encoded><![CDATA[<p>It is one thing to turn down the thermostat a notch or two at home and think about buying that hybrid car but how is your carbon footprint shaping up in the work environment? </p>
<p>Not good by most measures but the real shock comes with the realisation that something as simple (cough) as data testing and development is a massive green no-no.</p>
<p>The standard practice of using copies of production data in testing and development means that many organisations are apparently making 24 copies of their primary databases annually, with some reporting that demand pushes this as high as 120 copies a year. </p>
<p>It can’t be a surprise then to learn that copies are a massive drain on power, cooling and storage.</p>
<p>If the size of the production database is 500 GB then ten non-production copies will weigh in at 5 TB. That’s big, scary and unwieldy, hey, just like the now-dropped-from-production Humvee…</p>
<p>I’m guessing that the demand for the ‘cooler hybrid options’ of the testing and development world will grow and that means more attention should be given to the brave new world of synthetic data generation.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.grid-tools.com/blog/uncategorized/green-giant/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Why masked data is no good!</title>
		<link>http://www.grid-tools.com/blog/uncategorized/why-masked-data-is-no-good</link>
		<comments>http://www.grid-tools.com/blog/uncategorized/why-masked-data-is-no-good#comments</comments>
		<pubDate>Wed, 31 Mar 2010 14:52:28 +0000</pubDate>
		<dc:creator>Huwprice</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[data creation]]></category>
		<category><![CDATA[Data masking]]></category>
		<category><![CDATA[de-identify data]]></category>
		<category><![CDATA[test data]]></category>
		<category><![CDATA[Test Data Management]]></category>

		<guid isPermaLink="false">http://www.grid-tools.com/blog/uncategorized/why-masked-data-is-no-good</guid>
		<description><![CDATA[I’m a keen, bright developer at a bank, working on a new report, and the DBAs have given me a full copy of masked production data.  To test, I need to find some data that has changed over time.  Where should I start?  I think my boss has just had a big pay [...]]]></description>
			<content:encoded><![CDATA[<p>I’m a keen, bright developer at a bank, working on a new report, and the DBAs have given me a full copy of masked production data.  To test, I need to find some data that has changed over time.  Where should I start?  I think my boss has just had a big pay rise, so let’s try to find her.  Already I have three pieces of information: a) the sex is female; b) the monthly direct debit has increased by over 10%; and c) it happened in the last 30 days.  There are a million customers in the bank: a) reduces them to 500,000, b) reduces them to 2,212 and c) reduces them to 38.  Now I have a list of 38 people.  Let’s look for the date the annual company bonus is paid and bingo, there she is!  What a clever developer I am.  Now I can run off my reports and present them to my boss.  Won’t she be pleased?</p>
<p>Good old human curiosity will generally find a way.  For most complex systems there is so much information that finding the intersection of a few data points will usually get you to the data you need.  If you look at pretty much any HR or healthcare system there are so many data points that trying to second-guess human curiosity is mind-boggling.  Changing a name and address is not enough!   </p>
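<p>The narrowing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records are randomly generated stand-ins rather than real bank data, the field names (masked_id, sex, dd_change_pct, dd_changed_on) are hypothetical, and the counts it prints will not match the figures in the story. The point is simply that each extra clue shrinks the candidate list even though names and addresses are hidden.</p>

```python
import random
from datetime import date, timedelta

TODAY = date(2010, 3, 31)  # arbitrary reference date for the sketch


def make_masked_customers(n, seed=7):
    """Build n hypothetical masked records: identity hidden, behaviour intact."""
    rng = random.Random(seed)
    return [
        {
            "masked_id": f"CUST{i:07d}",       # name and address already replaced
            "sex": rng.choice(["F", "M"]),
            "dd_change_pct": rng.gauss(0, 4),  # % change in monthly direct debit
            "dd_changed_on": TODAY - timedelta(days=rng.randint(0, 364)),
        }
        for i in range(n)
    ]


def narrow(customers):
    """Apply the three clues in turn and return the list after each step."""
    females = [c for c in customers if c["sex"] == "F"]
    raised = [c for c in females if c["dd_change_pct"] > 10]
    recent = [c for c in raised if 30 >= (TODAY - c["dd_changed_on"]).days]
    return females, raised, recent


customers = make_masked_customers(100_000)
females, raised, recent = narrow(customers)
print(len(customers), len(females), len(raised), len(recent))
```

<p>None of the three filters touches a name or an address, which is why masking those fields alone does not stop this kind of search.</p>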
<p>The only safe way is to generate the data based on the characteristics of production, not the actual production data!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.grid-tools.com/blog/uncategorized/why-masked-data-is-no-good/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>True or False?</title>
		<link>http://www.grid-tools.com/blog/uncategorized/true-or-false</link>
		<comments>http://www.grid-tools.com/blog/uncategorized/true-or-false#comments</comments>
		<pubDate>Wed, 31 Mar 2010 14:51:04 +0000</pubDate>
		<dc:creator>Huwprice</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[test data]]></category>
		<category><![CDATA[Test Data Creation]]></category>
		<category><![CDATA[Test Data Management]]></category>

		<guid isPermaLink="false">http://www.grid-tools.com/blog/uncategorized/true-or-false</guid>
		<description><![CDATA[There is that old joke about a guy lost in the countryside who stops and asks for directions only to be told: “If I were you, I wouldn’t start from here.” 
I thought about that baffling wisdom during a conversation about the use of production data in testing and development. If you were to ask [...]]]></description>
			<content:encoded><![CDATA[<p>There is that old joke about a guy lost in the countryside who stops and asks for directions only to be told: “If I were you, I wouldn’t start from here.” </p>
<p>I thought about that baffling wisdom during a conversation about the use of production data in testing and development. If you were to ask why copies are used, what answers do you think you’ll hear? </p>
<p>In the main, I’m guessing it will be because that’s the way it has always been done. Sure, but dig further and underpinning the ‘why’ is the belief that it is the best way to ensure quality in systems testing. So the challenge comes when you’re told that it simply isn’t true. </p>
<p>Yes, production data is just the starting point and manual methods are then used to edit and extract relevant data but do conventional methods meet project objectives? Put aside for a second the need to mask sensitive live data and look again at the issue of quality. </p>
<p>Undoubtedly, quality data is the key to success but a copy of production data does not guarantee that &#8211; all it guarantees is high volume. </p>
<p>Manual techniques can enhance test data but not to any significant degree. Even automation tools such as QTP and LoadRunner improve richness but only to a certain extent and the risk is that defects and bugs remain undetected. </p>
<p>So working with the ‘if I were you, I wouldn’t start from here’ logic, projects should start with low volume data that offers a rich combination of scenarios. </p>
<p>And what is the quickest, most cost effective and timely way to achieve that? With software that is capable of creating data &#8211; (no more fears over live data leaks) &#8211; and that generates a richer coverage of characteristics that will deliver the functionality that you and your team have been tasked with. True or false? </p>
]]></content:encoded>
			<wfw:commentRss>http://www.grid-tools.com/blog/uncategorized/true-or-false/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Long arm of the Law</title>
		<link>http://www.grid-tools.com/blog/uncategorized/long-arm-of-the-law</link>
		<comments>http://www.grid-tools.com/blog/uncategorized/long-arm-of-the-law#comments</comments>
		<pubDate>Fri, 26 Feb 2010 17:01:48 +0000</pubDate>
		<dc:creator>VanessaHoward</dc:creator>
				<category><![CDATA[Synthetic versus Masked Data]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Data masking]]></category>
		<category><![CDATA[data obfuscation]]></category>
		<category><![CDATA[data security]]></category>
		<category><![CDATA[Synthetic test data]]></category>

		<guid isPermaLink="false">http://www.grid-tools.com/blog/uncategorized/long-arm-of-the-law</guid>
		<description><![CDATA[Sometimes governments can be forward-thinking &#8211; no, really, bear with me &#8211; and initiatives arrive that you can quietly applaud. 
The Welsh Assembly Government’s e-Crime Wales is a partnership of organisations, agencies and the police &#8211; that it has dedicated police business liaison officers is thought to be a world first. 
Data loss and security breaches [...]]]></description>
			<content:encoded><![CDATA[<p>Sometimes governments can be forward-thinking &#8211; no, really, bear with me &#8211; and initiatives arrive that you can quietly applaud. </p>
<p>The Welsh Assembly Government’s e-Crime Wales is a partnership of organisations, agencies and the police &#8211; that it has dedicated police business liaison officers is thought to be a world first. </p>
<p>Data loss and security breaches are a growing problem for businesses everywhere and Detective Constable John Cherry recently confirmed something most organisations are aware of when he said: “I have found that 70 per cent of threats come from within companies, either through malicious abuse of data or simple employee ignorance of existing threats.” </p>
<p>A Computer Security, Issues, &amp; Trends report placed the risk of security breaches from employees and former employees even higher, at 81 per cent. </p>
<p>But when government gives with one hand it can certainly take with the other and data protection issues are ever pressing (no matter that government departments have breached their own DPA principles). </p>
<p>When the Nationwide Building Society was fined £980,000 by the FSA for failing to manage information security back in 2007, everyone sat up and took notice. </p>
<p>Now that the UK Information Commissioner has made it clear that companies found wanting can be hit by ‘unlimited fines’ and that it is down to the ‘data controller to comply with the data protection principles’, day-to-day demands on data use are brought into sharp relief. </p>
<p>No matter what security policies are drafted, unless operational integrity is in place, good intentions will come unstuck. And when it comes to testing and development, testing on copies of production data carries unavoidable risks.</p>
<p>Discussions have recently centered on using representative or &#8220;fake&#8221; data for testing and development.  No, there is no other secure way &#8211; not even data obfuscation. Seeing as we&#8217;re now hearing that masking algorithms can easily be re-engineered (read: http://www.guardian.co.uk/technology/2010/jan/24/computer-security-crime-anonymous-datasets) the time is now to get on the train.   </p>
<p>Even if you set aside the advantages of the time and space saved, synthetic data is the most secure and fully compliant option &#8211; it seems the case for producing synthetic data is copper-bottomed. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.grid-tools.com/blog/uncategorized/long-arm-of-the-law/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>HIPAA QA and Grid-Tools Form Joint Venture Enabling Advanced Test Data Solutions Offering Greater Interoperability Across HIPAA 5010 and ICD-10</title>
		<link>http://www.grid-tools.com/blog/uncategorized/hipaa-qa-and-grid-tools-form-joint-venture-enabling-advanced-test-data-solutions-offering-greater-interoperability-across-hipaa-5010-and-icd-10</link>
		<comments>http://www.grid-tools.com/blog/uncategorized/hipaa-qa-and-grid-tools-form-joint-venture-enabling-advanced-test-data-solutions-offering-greater-interoperability-across-hipaa-5010-and-icd-10#comments</comments>
		<pubDate>Tue, 02 Feb 2010 09:41:45 +0000</pubDate>
		<dc:creator>Jess3589</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Data masking]]></category>
		<category><![CDATA[data masking HIPAA]]></category>
		<category><![CDATA[HIPAA 5010 test data]]></category>
		<category><![CDATA[HIPAA data compliance]]></category>
		<category><![CDATA[HIPAA test data]]></category>

		<guid isPermaLink="false">http://www.grid-tools.com/blog/uncategorized/hipaa-qa-and-grid-tools-form-joint-venture-enabling-advanced-test-data-solutions-offering-greater-interoperability-across-hipaa-5010-and-icd-10</guid>
		<description><![CDATA[February 1, 2010 &#8211; Naples, FL., London, UK;
HIPAA QA, Inc., a strategic healthcare IT quality assurance service provider focused on delivering best-in-class interoperable testing solutions for HIPAA 5010 and ICD-10 is pleased to announce a joint venture partnership with Grid-Tools Limited, a leading global vendor of data creation, data masking and test data management software.
The [...]]]></description>
			<content:encoded><![CDATA[<p>February 1, 2010 &#8211; Naples, FL., London, UK;<br />
HIPAA QA, Inc., a strategic healthcare IT quality assurance service provider focused on delivering best-in-class interoperable testing solutions for HIPAA 5010 and ICD-10 is pleased to announce a joint venture partnership with Grid-Tools Limited, a leading global vendor of data creation, data masking and test data management software.<br />
The joint venture will produce an X12N/HIPAA module for Datamaker™, Grid-Tools&#8217; signature test data management solution, and HIPAAdvanced™, a test data management service from HIPAA QA, Inc. HIPAAdvanced™ is designed to be vendor agnostic, making it available to all industry segments &#8211; payers, providers, clearinghouses, software vendors, integrators and outsourcing companies. Common industry test data sets enable robust internal testing and interoperable external testing, and can also be used to test vendors&#8217; 5010 software product delivery and EDI translators&#8217; HIPAA edit interpretations. This service guarantees the most interoperable test data available in the market today, and any ambiguities in HIPAA test files will be verified using X12&#8217;s interpretation portal.<br />
Mark Lott, President of HIPAA QA, highlights: &#8220;This partnership enables delivery of healthcare&#8217;s only full service solution offering secure PHI data masking, 5010 test data generation with data-on-demand and allows for seamless integration with all major test automation tools encompassing all X12N transaction sets, data relationships, legacy data formats and HIPAA 5010 file interdependencies.&#8221; As the original architect of HCCO CCAP, the healthcare industry&#8217;s first HIPAA 4010A common EDI interpretation testing and certification program, Mr. Lott has a unique perspective on what it takes to deliver interoperability to HIPAA transactions; having the tools and methodology to create robust test data is a crucial first step. Additional CCAP details can be found at www.hipaaconformance.org/index.htm.<br />
Huw Price, Grid-Tools Managing Director, says &#8220;Leveraging our expertise in supplying test data to protect highly sensitive environments in banking, government, healthcare and pharmaceuticals, we have partnered with HIPAA QA to deliver this very important toolset for the US healthcare market. GT Datamaker™ contains an extremely sophisticated data masking and data obfuscation solution that de-identifies sensitive and private data records so they can be used outside of live production environments. The tool also delivers automated file generation, conversion and can obfuscate or anonymize PHI data while leaving relationships and referential integrity in place across the data landscape.&#8221;<br />
Mr. Lott comments &#8220;We are excited to demonstrate to industry leaders how this strategic, innovative and proven solution is uniquely qualified to fill the test data gaps currently being experienced within HIPAA 5010 and ICD-10 implementation efforts. The industry needs a common solution that will enable large scale success for the industry and our solution does exactly that by providing a single source solution that payer, providers and clearinghouses can all test with sharing similar test data enabling interoperability at its highest possible level. Without a doubt highly advanced testing and test data solutions will be critical to unequivocally validate that ICD-10 is implemented correctly with no negative financial impacts to healthplans and providers.&#8221;<br />
Huw Price adds &#8220;In light of HIPAA privacy and security regulations combined with the enhanced prohibited use of production data containing PHI, the industry needs a comprehensive secure solution. Organizations are looking for a tool and service which provides development and test teams with secure, unidentifiable and valid test data that exercises codebases without the use of production data.&#8221;<br />
About Grid-Tools Limited: Grid-Tools are specialists in data creation, data masking and test data management. Their experienced personnel have been writing and developing solutions for large companies in both the private and public sectors for over 30 years. The Grid-Tools Datamaker™ suite includes a wide range of tools for test data management including such innovative products as Datamaker™, a revolutionary tool that creates and publishes quality test data with the referential integrity of production environments for testing and development. Datamaker™ offers three methods for managing and generating data: database subsetting, data obfuscation (data masking and data de-identification) and synthetic data generation. The Grid-Tools methodology takes a “data-centric” approach to testing, in which the focus is on ensuring that the test data you use is of the right quality for successful testing.<br />
About HIPAA QA, Inc.: HIPAA QA, Inc., is a strategic healthcare IT quality assurance service partner focused on delivering best-in-class interoperable testing solutions for HIPAA 5010 and ICD-10 and leverages extensive experience in testing methodologies, test data management, configuration management and test automation practices along with 10 years&#8217; experience in HIPAA testing, certification and test data solutions.<br />
HIPAA QA Contacts:<br />
HIPAA QA, Inc.<br />
8951 Bonita Beach Rd SE<br />
Suite 525<br />
Bonita Springs, FL. 34134<br />
Office &#8211; 866-812-3411<br />
www.hipaaqa.com<br />
Grid-Tools Contacts:<br />
Grid-Tools Limited<br />
11 Oasis Business Park<br />
Eynsham<br />
Oxfordshire<br />
OX29 4TP<br />
Phone Europe UK: +44 (0) 1865 884600<br />
Phone USA: 1-866-563-3120<br />
Web: www.grid-tools.com<br />
Email: info@grid-tools.com<br />
NOTE: All trademarks and registered trademarks are the properties of their respective owners.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.grid-tools.com/blog/uncategorized/hipaa-qa-and-grid-tools-form-joint-venture-enabling-advanced-test-data-solutions-offering-greater-interoperability-across-hipaa-5010-and-icd-10/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using data masking to create &#8216;anonymous&#8217; datasets</title>
		<link>http://www.grid-tools.com/blog/uncategorized/using-data-masking-to-create-anonymous-datasets</link>
		<comments>http://www.grid-tools.com/blog/uncategorized/using-data-masking-to-create-anonymous-datasets#comments</comments>
		<pubDate>Mon, 25 Jan 2010 10:57:40 +0000</pubDate>
		<dc:creator>Jess3589</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Data masking]]></category>
		<category><![CDATA[data masking methods]]></category>
		<category><![CDATA[data masking techniques]]></category>
		<category><![CDATA[de-identify sensitive data]]></category>

		<guid isPermaLink="false">http://www.grid-tools.com/blog/uncategorized/using-data-masking-to-create-anonymous-datasets</guid>
		<description><![CDATA[Yesterday morning I woke to find an interesting article in &#8216;The Observer&#8217; about anonymizing or masking personal data records. This turned out to be somewhat ironic considering I wrote two blogs on this very topic just last week!
The article, written by Anushka Asthana (Policy Editor), discussed the concerns around using data masking or &#8220;anonymization techniques&#8221; [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday morning I woke to find an interesting article in &#8216;The Observer&#8217; about anonymizing or masking personal data records. This turned out to be somewhat ironic considering I wrote two blogs on this very topic just last week!</p>
<p>The article, written by Anushka Asthana (Policy Editor), discussed the concerns around using data masking or &#8220;anonymization techniques&#8221; to de-identify sensitive and personal information. Many large and well-known multi-national organizations and government agencies are using data masking methods to keep their production data secure (or anonymous) when they use it in development and test.</p>
<p>Anushka&#8217;s article, however, states that computer scientists in the US have discovered ways to &#8220;re-identify&#8221; the personal information of individuals who were included in anonymous datasets. How? By using a statistical &#8220;de-anonymization&#8221; technique or, as my last blog suggested, by re-engineering masked test data.</p>
<p>So, as Anushka&#8217;s article asks, just how safe is it to share personal and sensitive information even if it is masked or de-identified? The answer, once again, is not very.</p>
<p>Organizations should start looking into other methods to secure their production data when using it outside of their &#8220;live&#8221; environment; whether this be for testing, development, training, QA or even presenting statistical information. My last two blogs discuss the option of using &#8216;data creation&#8217; techniques. No, this isn&#8217;t the process of &#8220;creating&#8221; or making-up some data based on whatever fake names or addresses come into your head. It&#8217;s quite a sophisticated process, and the end-product is secure test data that can never be re-engineered. It&#8217;s based on a model of your production environment, so the data maintains referential integrity and is exactly like &#8220;live&#8221; data, but it isn&#8217;t. Read my last two blogs to find out more.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.grid-tools.com/blog/uncategorized/using-data-masking-to-create-anonymous-datasets/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>So, why are you still using data masking?</title>
		<link>http://www.grid-tools.com/blog/uncategorized/so-why-are-you-still-using-data-masking</link>
		<comments>http://www.grid-tools.com/blog/uncategorized/so-why-are-you-still-using-data-masking#comments</comments>
		<pubDate>Thu, 21 Jan 2010 11:25:08 +0000</pubDate>
		<dc:creator>Jess3589</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[data creation]]></category>
		<category><![CDATA[Data masking]]></category>
		<category><![CDATA[data masking methods]]></category>
		<category><![CDATA[Test Data Creation]]></category>
		<category><![CDATA[Test Data Management]]></category>

		<guid isPermaLink="false">http://www.grid-tools.com/blog/uncategorized/so-why-are-you-still-using-data-masking</guid>
		<description><![CDATA[Most of us have no idea when it comes to figuring out ways to acquire the right kind of data we need for any type of test or development project.  We’re lost.  We’ve been taking copies of our live environments for years.  It’s the only method we know.  
There is only [...]]]></description>
			<content:encoded><![CDATA[<p>Most of us have no idea when it comes to figuring out ways to acquire the right kind of data we need for any type of test or development project.  We’re lost.  We’ve been taking copies of our live environments for years.  It’s the only method we know.  </p>
<p>There is only one issue changing our IT infrastructures and threatening the comfortable, reliable procedures we know so well: compliance.  Yes, we’re now in the data protection era.  Regulating bodies would rather personal information be kept safe and secure inside the production database where it belongs, thank-you-very-much.  Also, ignoring compliance measures isn’t really the best idea.  We’ve learnt this firsthand from our competitors and their very public data breaches whilst secretly smiling to ourselves, relieved it didn’t happen to us.  </p>
<p>However, let us consider for a moment the potential implications of your organization’s sensitive data being leaked into the public domain by some unfortunate soul’s mindless mistake:</p>
<p>•	Potential lawsuit – check<br />
•	Damage to corporate brand and reputation – check<br />
•	Large fines imposed by regulating bodies – check<br />
•	Loss of integrity and potential loss of customers &#8211; check    </p>
<p>So, what about data masking – you ask? It’s a quick and easy way to solve the problem, right?  Let’s get a bit of production data, scramble it up, de-identify some names and &#8211; there you go.  You’ve got your test data and the perfect solution to the thorn digging into your side.  </p>
<p>Oh, but wait.  We now hear data masking isn’t actually that secure.  Bugger. </p>
<p>Then what alternative do we have? Well, using &#8216;synthetic&#8217; or &#8216;fake&#8217; test data seems to be the topic of the day – the new method, the new ‘fad’ if you will.</p>
<p>Now, for those of you who don’t know, test data creation is a by-product of modeling and sampling production data.  What is modeled is then turned into data objects or templates which are based on the entire production environment.  The templates can be edited or enhanced based on project needs, or moved into different data formats like flat files or CSV files.  </p>
<p>This may sound impossible to you.  You may be thinking about the referential integrity of your production database; the value of every table, every cell, every format, every name, how each table is ever so consequentially connected to one another.  But it&#8217;s rather easy, actually &#8211; very easy to be frank. </p>
<p>I’ve started looking into this as part of my consultancy.  It came to light when I was contacted by a government agency needing an alternative to data masking.  I’m impressed.  It is true – using test data creation is the way forward.  On top of this, less data is actually being used and stored, since the data is generated from the model into a small template.  Also, believe it or not, synthetic data gives you better code and functional coverage.  How?  Well, it’s easier than you think.  You simply point your test data creation tool toward its data editing, enhancing and manipulation functions.  </p>
<p>Oh, and yes, I nearly forgot about the most important point.  More secure than data masking, you say?! Well, it is.  This is because data creation tools never actually access or manipulate ‘live’ production data.  They can, in fact, create data without data. Confusing?  Most data masking tools anonymize your test and development data by moving live production data into a separate staging environment so it can then be masked. How is this secure?  Well, it’s not really.  Likewise, I bet you didn’t know that masked data can be reengineered back into its original format fairly easily.  Synthetic data can never be reengineered because it’s not actually ‘real’ data.</p>
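<p>As a rough illustration of what creating data without data can look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the MODEL dictionary, its field names and its proportions are invented, and this is not how any particular test data creation tool works. It only shows rows being generated from a description of the data's characteristics, with child rows that keep referential integrity to their parents.</p>

```python
import random

# Hypothetical "model": only the shape of the data, never live rows.
MODEL = {
    "account_type": {"current": 0.6, "savings": 0.3, "mortgage": 0.1},
    "balance_range": (0, 50_000),
}


def generate_customers(n, seed=0):
    """Create n synthetic customers that follow the model's proportions."""
    rng = random.Random(seed)
    types = list(MODEL["account_type"])
    weights = list(MODEL["account_type"].values())
    low, high = MODEL["balance_range"]
    return [
        {
            "customer_id": i,
            "name": f"Test Customer {i}",  # invented, so nothing to re-identify
            "account_type": rng.choices(types, weights=weights)[0],
            "balance": round(rng.uniform(low, high), 2),
        }
        for i in range(1, n + 1)
    ]


def generate_payments(customers, per_customer=2, seed=1):
    """Create child rows whose customer_id always points at a generated parent."""
    rng = random.Random(seed)
    return [
        {"customer_id": c["customer_id"], "amount": round(rng.uniform(1, 500), 2)}
        for c in customers
        for _ in range(per_customer)
    ]


customers = generate_customers(5)
payments = generate_payments(customers)
print(len(customers), "customers,", len(payments), "payments")
```

<p>Because every value comes from the model and a random number generator, there is no original record behind any row for anyone to work back to.</p>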
<p>So, why are you still using data masking?  My guess is because you don’t know enough about data creation &#8211; start reading!  It’s the way forward.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.grid-tools.com/blog/uncategorized/so-why-are-you-still-using-data-masking/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>When security is a concern, synthetic data generation is unmatched</title>
		<link>http://www.grid-tools.com/blog/data-masking-blog/synthetic-versus-masked-data/when-security-is-a-concern-synthetic-data-generation-is-unmatched</link>
		<comments>http://www.grid-tools.com/blog/data-masking-blog/synthetic-versus-masked-data/when-security-is-a-concern-synthetic-data-generation-is-unmatched#comments</comments>
		<pubDate>Wed, 09 Dec 2009 03:50:33 +0000</pubDate>
		<dc:creator>JamesKoopmann</dc:creator>
				<category><![CDATA[Synthetic versus Masked Data]]></category>

		<guid isPermaLink="false">http://www.grid-tools.com/blog/?p=107</guid>
		<description><![CDATA[Gartner reports that more than 80% of companies will use live production data for nonproduction purposes. This can include BI, QA, Development, and Testing. Clearly we all know by now the problems with moving production data around to non-production and non-secure environments. I mean really, how many times do we have to read about data theft? The problem here lies in the fact that these systems need data. So why do we just copy production data? I’ll tell you, it’s just darn too easy to get what you want and for the most part intentions to not leak information are usually good.]]></description>
			<content:encoded><![CDATA[<p>Gartner reports that more than 80% of companies will use live production data for non-production purposes. These can include BI, QA, development, and testing. By now we all know the problems with moving production data around to non-production, non-secure environments. I mean really, how many times do we have to read about data theft? The problem lies in the fact that these systems need data. So why do we just copy production data? I’ll tell you: it’s just too darn easy to get what you want that way, and for the most part the intentions are good; nobody sets out to leak information.</p>
<p>Well, fortunately most data centers are starting to read the writing on the wall and are beginning to discourage simply copying data from production system to test system to QA system to who knows where. Data centers are now implementing one of two methods, masking data or creating synthetic data, to help secure sensitive information on its way to these secondary systems. But how secure is each of these methods? Is one better than the other?</p>
<p><strong>Data masking</strong> is easy to understand: it obscures specific pieces of information, replacing sensitive values with realistic data that does not identify the values it replaced. So if John Doe had an SSN, data masking would replace it with another SSN that could not be traced or reverse engineered back to John Doe’s original SSN. But how secure is this method? Two typical scenarios undermine it:</p>
<ul>
<li>The inability to understand what needs to be masked. Let’s face it, databases are complex, the information is often obscure, and it takes years to master the schemas of these systems. Unfortunately, many database systems are managed by teams that do not fully understand the data. That makes masking all the sensitive data a database contains little more than a shot in the dark, so sensitive information is still copied throughout the enterprise and a gaping security hole remains.</li>
<li>The inability to use a masking tool effectively. Call this operator error, but many times the copy is performed before masking takes place, either unknowingly or with the intention of masking the data on the target system. Either way, sensitive information has left the production system, has traveled through the network, and has landed on an unsecured system. Again, sensitive data is left vulnerable while it travels on the network and for as long as it sits unmasked on the target system.</li>
</ul>
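<p>To make the first scenario concrete, here is a minimal Python sketch of column-based masking. It is an illustration only, not how any particular masking tool works: the <code>mask_ssn</code> and <code>mask_row</code> functions, the <code>SENSITIVE_COLUMNS</code> set, and the sample row are all hypothetical. The replacement value is drawn at random rather than derived from the original, so it cannot be reversed, and it uses area numbers 900-999, which are not issued as real SSNs.</p>

```python
import secrets

# Hypothetical list of columns someone has identified as sensitive.
# Anything missing from this set passes through unmasked.
SENSITIVE_COLUMNS = {"ssn"}


def mask_ssn(_original: str) -> str:
    """Return a random SSN-shaped value that is not derived from the input."""
    area = 900 + secrets.randbelow(100)   # 900-999: never issued as a real SSN area
    group = 1 + secrets.randbelow(99)     # 01-99
    serial = 1 + secrets.randbelow(9999)  # 0001-9999
    return f"{area:03d}-{group:02d}-{serial:04d}"


def mask_row(row: dict) -> dict:
    """Mask only the columns listed in SENSITIVE_COLUMNS; copy the rest as-is."""
    return {
        column: mask_ssn(value) if column in SENSITIVE_COLUMNS else value
        for column, value in row.items()
    }


row = {
    "name": "John Doe",
    "ssn": "123-45-6789",
    "notes": "SSN on file: 123-45-6789",  # free-text column nobody flagged
}
masked = mask_row(row)
print(masked)
```

<p>In this sketch the <code>ssn</code> column is replaced, but the <code>notes</code> column still carries the original SSN because nobody listed it as sensitive. That is exactly the shot-in-the-dark problem described above.</p>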
<p><strong>Synthetic Data Generation</strong>, on the other hand, is the process of creating realistic data that has no real identity behind it, so the result is anonymous by construction. The most important aspect of synthetic data generation is that, because it doesn’t strictly rely on production data, rules or conditions can be defined to generate data that tests aspects of an application or system that normal production data might not be able to exercise. How secure is synthetic data generation?</p>
<ul>
<li>Clearly, synthetic data generation is the ultimate in protecting the privacy of real production data. No longer are test systems tied to production data and no longer is production data flying through the enterprise unsecured.</li>
</ul>
<p>I enjoy some banter on this subject, as to me synthetic data generation seems the most secure method of providing test data throughout the enterprise. I can see how many would be put off by what seems to be a very complex initiative. But with tools such as <a href="../../datamaker.php">Datamaker</a> from <a href="../../index.php">Grid-Tools</a>, I see this aversion to creating test data being reduced. Plus, with the added value of ensuring representative data is always available for testing, synthetic data generation can be extremely beneficial in providing data that goes far beyond raw production data while encompassing relationships, properties, and nuances of data not normally seen in production.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.grid-tools.com/blog/data-masking-blog/synthetic-versus-masked-data/when-security-is-a-concern-synthetic-data-generation-is-unmatched/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Grid-Tools brings objects to synthetic data generation</title>
		<link>http://www.grid-tools.com/blog/test-data-creation/grid-tools-brings-objects-to-synthetic-data-generation</link>
		<comments>http://www.grid-tools.com/blog/test-data-creation/grid-tools-brings-objects-to-synthetic-data-generation#comments</comments>
		<pubDate>Fri, 13 Nov 2009 12:17:51 +0000</pubDate>
		<dc:creator>JamesKoopmann</dc:creator>
				<category><![CDATA[Test Data Creation]]></category>
		<category><![CDATA[application performance]]></category>
		<category><![CDATA[database performance]]></category>
		<category><![CDATA[synthetic]]></category>
		<category><![CDATA[Synthetic test data]]></category>

		<guid isPermaLink="false">http://www.grid-tools.com/blog/?p=96</guid>
		<description><![CDATA[Developing an application to meet business needs relies not only on technology but also on the ability to test the technology used. In a recent study, Gartner identified key reasons why projects fail. Among the top 5 reasons was that “Too many project changes resulted in an overly complex system that was hard to test.” Likewise, Gartner’s “hit list” of the most common problems associated with the increased risk of failures in SOA projects included “no proof of concept and no stress tests”.]]></description>
			<content:encoded><![CDATA[<p>Developing an application to meet business needs relies not only on technology but also on the ability to test the technology used. In a <a href="http://blogs.gartner.com/road-notes/2009/09/09/verifying-and-validating-to-reduce-project-failures/">recent study</a>, Gartner identified key reasons why projects fail. Among the top 5 reasons was that “Too many project changes resulted in an overly complex system that was hard to test.” Likewise, Gartner’s “hit list” of the most common problems associated <a href="http://www.gartner.com/it/page.jsp?id=508397">with the increased risk of failures in SOA projects</a> included “no proof of concept and no stress tests”.</p>
<p>Anyone who has had to test applications understands that proper application design and functional testing depend directly on the availability of a representative dataset. Let’s face it, we all know that testing applications is vitally important, but we often neglect extensive testing because finding, massaging, or generating decent test data is considered too difficult and time consuming. Likewise, application and database performance testing is often inadequate, which inhibits the ability to intelligently modify database structures and to approximate database performance under application usage.</p>
<p>It would seem to most that real-world data would be best for application and database performance testing. Unfortunately, real-world data is not always available, may be hard to acquire, may not exist in sufficient quantity, or may be missing important properties (PKs, FKs, etc.). For these reasons, synthetic data generation is a viable alternative that helps ensure representative data is always available for testing. Synthetic data generation can be extremely beneficial for several reasons:</p>
<ol>
<li>Data goes far beyond raw values and encompasses relationships, properties, and nuances. It is the ability of a synthetic generation tool to build in these relationships that enables fine-grained testing within applications.</li>
<li>With a complete and representative dataset, database parameters, queries, indexing strategies, table structures, etc. can be tested and modeled to help tune database performance.</li>
<li>Instead of using ‘canned’ data, it is much easier to model and introduce outlying data and ask important questions about how an application might behave.</li>
<li>It is much easier to create standardized benchmarking datasets that can be reused as the application or database design changes.</li>
</ol>
<p>Objects and object-oriented design have become mainstream within many data centers. Compiling attributes into object entities and modeling interacting objects is almost second nature. <a href="http://www.grid-tools.com/index.php">Grid-Tools</a> has brought the concept of objects to synthetic data generation, enabling users of its <a href="http://www.grid-tools.com/datamaker.php">Datamaker</a> tool to assemble data objects together for the generation of test data. For instance, an application that requires data representative of an order system might link together separate objects that represent internet, phone, or walk-in orders, with each type of order object containing fixed or variable attributes that uniquely define the object.</p>
<p>A data object specification can include, but is not limited to, fixed and variable data, inheritance by other objects, a defined structure, attributes, relationships to external data, expected results, and test coverage data. Data objects are what drive the power of Grid-Tools Datamaker to generate a rich and meaningful spread of data that is representative of the application to be tested, with the aim of maximum coverage of the code base.</p>
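<p>As a rough illustration of the data-object idea, the order example can be sketched in plain Python. This is not Datamaker’s API or file format; every class, attribute, and function name below (<code>OrderObject</code>, <code>InternetOrder</code>, <code>generate_orders</code>, and so on) is hypothetical. Each order type inherits from a base object, carries one fixed attribute (its channel), draws its variable attributes at generation time, and stores an expected result alongside the data.</p>

```python
import random


class OrderObject:
    """Hypothetical base data object for an order."""

    channel = "unknown"  # fixed attribute, overridden by each order type

    def generate(self, rng: random.Random) -> dict:
        # Variable attributes are drawn fresh for every generated row.
        quantity = rng.randint(1, 5)
        unit_price_cents = rng.randint(100, 50000)
        return {
            "channel": self.channel,
            "quantity": quantity,
            "unit_price_cents": unit_price_cents,
            # Expected result travels with the data so a test can check it.
            "expected_total_cents": quantity * unit_price_cents,
        }


class InternetOrder(OrderObject):
    channel = "internet"

    def generate(self, rng: random.Random) -> dict:
        row = super().generate(rng)
        # 192.0.2.0/24 is reserved for documentation (RFC 5737).
        row["ip_address"] = "192.0.2." + str(rng.randint(1, 254))
        return row


class PhoneOrder(OrderObject):
    channel = "phone"


class WalkInOrder(OrderObject):
    channel = "walk-in"

    def generate(self, rng: random.Random) -> dict:
        row = super().generate(rng)
        row["store_id"] = rng.randint(1, 20)
        return row


def generate_orders(count: int, seed: int = 0) -> list:
    """Generate `count` orders, cycling through the three order types."""
    rng = random.Random(seed)  # seeded so the same dataset can be rebuilt
    kinds = [InternetOrder(), PhoneOrder(), WalkInOrder()]
    return [kinds[i % len(kinds)].generate(rng) for i in range(count)]


orders = generate_orders(6)
for order in orders:
    print(order)
```

<p>Because the generator is seeded, the same call rebuilds the same dataset, which is one simple way to get the reusable benchmarking datasets mentioned earlier. No production data is read at any point.</p>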
]]></content:encoded>
			<wfw:commentRss>http://www.grid-tools.com/blog/test-data-creation/grid-tools-brings-objects-to-synthetic-data-generation/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>