As we witness the advance of machine learning and artificial intelligence, we cannot but marvel at the value that can be extracted from the massive volumes of business data that organizations accumulate over the years. But accumulate may not be the right word here. To gain any insight from this data, it’s not enough to let it aggregate and lie scattered around devices, networks, PST files and systems. Instead, companies are turning to data archiving solutions for a more organized approach.
While data is rapidly becoming the new gold, organizations are starting to recognize the importance of strategic data archiving. So let’s dive into the definition of data archiving and explore its advantages, along with some best practices of data retention.
What Is Data Archiving?
Data archiving, sometimes referred to as Enterprise Information Archiving (EIA), is the process by which an organization creates a long-term archive of its structured and unstructured communications data for reasons such as compliance requirements, lawsuit management, storage reduction, business intelligence and information governance.
Data archiving can be viewed as an integral part of records management – the handling of corporate records throughout their lifecycle and treating them as evidence of a company’s activities. Although organizations initially started archiving to meet compliance requirements, a lot has changed in recent years. This defensive approach has been replaced by a more proactive one, where archives are used not only to prove compliance and protect companies from litigation and failed audits, but also as repositories of vast knowledge and business intel.
Email was the first type of electronic communications data to be included in the data retention requirements of a major US law. The seminal Sarbanes-Oxley Act of 2002 was the first piece of legislation to mandate the retention of electronic communications alongside paper records, and it specified a retention period of 5 years.
As years went by and the electronic channels used for business communication multiplied, data archiving branched out to include many other data types. Here’s a list of the most commonly archived communication records today:
- Email (with attachments)
- Social media channels (Facebook, Twitter, Instagram etc.)
- Internal and external collaboration platforms (Teams, Zoom, Slack, Meet etc.)
- Instant messaging platforms (e.g. WhatsApp)
- Mobile calls, voicemail and text messages
What are the major benefits of data archiving?
Why is data archiving important? Data archives can help with several major aspects of business operations:
Meet compliance requirements
From the compliance perspective, running a school or a government department is like running a business. Compliance therefore remains the number one reason to archive data for organizations in regulated industries, where strict laws govern the retention of electronic records and specify the periods during which records must be kept and readily available.
Complying with industry standards for retaining records is of utmost importance for most organizations. Communication channels like email are one of the primary sources of business information – so ensuring that they are retained for the proper length of time is crucial. Industries such as education, finance, healthcare, and professional services have strict regulations on how data must be stored and the specific length of time it must be retained.
Are more mandates coming? We can safely assume that the answer is yes, given that organizations (both enterprises and government departments) face significant challenges with the exponential growth of data and that the number of channels used for communication is increasing, further complicating the picture. While some countries (such as the US) and industries (such as finance and healthcare) are more regulated than others, data archiving is on its way to becoming the norm globally.
Support litigation and ediscovery
Email and chat records contain business insight, but they are also filled with information that can be used as evidence in various kinds of legal cases, such as employee disputes, discrimination claims, fraud or embezzlement. Because of the pandemic, employers may have even more workplace lawsuits and employee relations matters to resolve than usual.
Data archiving systems serve as tremendous support in electronic discovery – especially in the Early Case Assessment – the process in which vast sets of data are searched, reviewed and presented to lawyers in order to determine whether the organization should proceed with litigation or propose a settlement.
Until recently, roughly 70% of ediscovery costs were paid to legal counsel and outsourced service providers. The current trend is to involve fewer third parties and do more of the work in-house, deploying ediscovery and data archiving solutions in an attempt to control litigation costs.
This means that the internal legal and IT teams need to obtain at least some kind of ediscovery expertise. At the same time, communication channels have multiplied and become more complex, so teams are no longer looking only at structured data like email.
More and more discoverable data is so-called “dark data”: unstructured data that comes from non-traditional sources like mobile devices, social media channels, and video and audio files. It is precisely this kind of data that accounted for 27% of all data generated in 2020.
Without an all-encompassing data archiving solution, this information is typically scattered across servers, devices and PST files throughout your organization. This dispersal makes it very difficult to control and use proactively.
In addition, employees increasingly create, access and manage business information from personal devices. And even the most compliance- or ediscovery-conscious people make mistakes when generating or deleting data on their own (BYOD) devices.
Reduce storage load and costs while increasing productivity
Studies have shown that about 75% of a company’s intellectual property is embedded in its email and instant messaging services. This data is incredibly important to protect, but it can put a heavy load on the servers that store it, especially as storage requirements keep growing.
Storing and archiving email data on your production servers will undoubtedly reduce their performance and speed. When deleting is not an option because regulatory compliance requires long retention periods, data archiving is the answer, as it allows organizations to store messages safely on an off-site server or in the cloud. Additional options include automatic removal of duplicate messages as well as advanced compression techniques that reduce the strain on servers by more than 50%.
On top of that, relieving the servers of additional data lets your employees work more quickly and speeds up access to their own stored emails and messages. Archive data storage solutions are more cost-effective and also decrease the load on your IT department, as employees will no longer need to ask for assistance when backing up their inboxes or looking for old, deleted or misplaced files and communication records.
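The duplicate removal and compression mentioned above can be illustrated with a minimal Python sketch. It assumes messages arrive as plain strings and that exact duplicates are detected by hashing the full body; the function names are hypothetical, and real archiving products use their own (typically more sophisticated) methods.

```python
import hashlib
import zlib


def archive_messages(messages):
    """Store each unique message body once, compressed, keyed by its SHA-256 digest."""
    store = {}       # digest -> compressed bytes (one copy per unique message)
    references = []  # one digest per incoming message, in arrival order
    for body in messages:
        raw = body.encode("utf-8")
        digest = hashlib.sha256(raw).hexdigest()
        if digest not in store:
            # Only the first copy of a message is compressed and stored
            store[digest] = zlib.compress(raw, 9)
        references.append(digest)
    return store, references


def restore_message(store, digest):
    """Return the original message text for a stored digest."""
    return zlib.decompress(store[digest]).decode("utf-8")


# Illustrative data only: two identical messages and one different message
messages = [
    "Quarterly report attached. " * 40,
    "Quarterly report attached. " * 40,
    "Meeting moved to 3 pm. " * 40,
]
store, refs = archive_messages(messages)
original_size = sum(len(m.encode("utf-8")) for m in messages)
archived_size = sum(len(blob) for blob in store.values())
print(f"{original_size} bytes before, {archived_size} bytes after")
```

The savings in this toy example come from highly repetitive text and say nothing about the ratios achievable on real mailboxes.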
Data archiving vs. backup: What is the difference?
Backups are typically snapshots of the entire system, so it is difficult (if not impossible) to single out individual items for long-term retention or to segregate them so that different retention policies can be applied based on the importance of the data or the department it belongs to. Data archiving captures data in near real-time, indexes the items with complete metadata and moves them to a new location, making it much easier to retrieve specific records.
Another key difference is that the purpose of backups is to serve as disaster recovery mechanisms. Data archives, on the other hand, are intended for long-term storage and active use of data.
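To make the contrast with snapshot-style backups concrete, here is a small Python sketch of item-level capture and metadata indexing. The `ArchivedItem` fields and the `Archive` class are assumptions made for illustration, not the data model of any particular product.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ArchivedItem:
    item_id: str
    sender: str
    channel: str
    sent_at: datetime
    body: str


class Archive:
    """Keeps individual items plus a metadata index for targeted retrieval."""

    def __init__(self):
        self._items = {}
        self._by_field = defaultdict(set)  # (field name, value) -> item ids

    def capture(self, item):
        """Store one item and index it by sender and channel."""
        self._items[item.item_id] = item
        for name in ("sender", "channel"):
            self._by_field[(name, getattr(item, name))].add(item.item_id)

    def find(self, **criteria):
        """Return items matching every given metadata field, oldest first."""
        ids = set(self._items)
        for name, value in criteria.items():
            ids &= self._by_field.get((name, value), set())
        return sorted((self._items[i] for i in ids), key=lambda it: it.sent_at)


archive = Archive()
archive.capture(ArchivedItem("m1", "ana@example.com", "email", datetime(2023, 5, 2), "Contract draft"))
archive.capture(ArchivedItem("m2", "ana@example.com", "chat", datetime(2023, 5, 3), "Call at 3?"))
archive.capture(ArchivedItem("m3", "ben@example.com", "email", datetime(2023, 5, 4), "Invoice"))

matches = archive.find(sender="ana@example.com", channel="email")
print([item.item_id for item in matches])
```

Because every item is indexed when it is captured, a single record can be located without restoring a whole system snapshot.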
What is a data archival strategy?
Like any new business process, data archiving should be approached strategically, by examining the relevant data retention regulations, creating a data archiving policy and paying attention to data protection and privacy. Here are the main points to consider when creating a data retention policy:
- Analyze the regulations that apply to your vertical and the location(s) of your business and make sure your data archiving policy follows the retention periods outlined in the laws.
- Make sure to destroy the records after the retention windows expire. Another great benefit of data archiving tools is that the retention can be automated, which means that all items under a single policy can be automatically deleted from the system once the retention periods expire.
- Avoid retaining all your data indefinitely, as this creates liability and increases the amount of time required to locate the data while searching. Some data, however, will need to be archived for longer periods of time or forever – this can include C-level management records, email records detailing important business decisions, employment records etc.
- Pay attention to the security of your data archiving solution. Storage does not equal security by default when it comes to data. Archives contain all kinds of sensitive information: financial data or personally identifiable information about your customers and employees. Look for a data archiving services provider that has the right technology and expertise and promises a near-perfect uptime, as well as security and data privacy certifications and a strong service level agreement.
- Include multiple departments in the creation of your data archiving strategy – it’s best if your legal/compliance, IT and HR teams collaborate on the retention policy, since different departments typically have different (and sometimes opposing) needs and could advocate for different retention periods.
- Take the time to explore different types of data archiving software solutions and make sure to have a demo or do a POC in order to test the solution. If you need to archive multiple data types (e.g. email, social media and mobile messages), look for vendors that can provide all the services. That way you’ll centralize your data archiving and ediscovery and be able to manage all your communication channels from a single system.
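The automated retention described in the points above can be sketched in a few lines of Python. The categories and retention periods below are placeholders chosen for illustration, not legal guidance, and the `Record` structure is hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Record:
    record_id: str
    category: str
    archived_on: date


# Hypothetical retention periods in days; None means "keep indefinitely"
RETENTION_DAYS = {
    "email": 5 * 365,
    "chat": 3 * 365,
    "executive": None,
}


def is_expired(record, today):
    """True if the record's retention window has passed."""
    days = RETENTION_DAYS.get(record.category)
    if days is None:
        # Indefinite retention; unknown categories are also kept, to be safe
        return False
    return today > record.archived_on + timedelta(days=days)


def purge_expired(records, today):
    """Split records into those to keep and the ids of those to delete."""
    kept = [r for r in records if not is_expired(r, today)]
    removed = [r.record_id for r in records if is_expired(r, today)]
    return kept, removed


records = [
    Record("e1", "email", date(2018, 1, 1)),
    Record("c1", "chat", date(2023, 6, 1)),
    Record("x1", "executive", date(2000, 1, 1)),
]
kept, removed = purge_expired(records, today=date(2024, 1, 1))
print("kept:", [r.record_id for r in kept], "removed:", removed)
```

Keeping unknown categories by default is a deliberately conservative choice in this sketch; a real policy would define explicitly what happens to unclassified records.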
Data archiving software: What features to look out for?
In order for a data archiving solution to meet your expectations in terms of compliance and lawsuit management, there are some critical features and points to consider and include in your data archiving plan:
- Deployment options – There is no single best way to archive data, as different deployments have different advantages and disadvantages. Enterprises with large IT departments may prefer and feel more comfortable using on-premise solutions which allow them more control of the archiving process. Small and medium-sized organizations, in contrast, may lack such capacities and typically choose cloud-based data archiving platforms or a hybrid environment. In case the organization decides to use a hardware solution, it’s crucial to ensure that it relies on fault-tolerant technologies to prevent disk failures and potential data loss and that it comes with good scaling options.
- Configurable retention policies – Granular retention policies and the ability to schedule automatic deletion will allow easy policy management and minimize human error. While examining different data archiving tools (some of which come with different price/feature packages) be sure to ask if there is a limit to the number of retention policies that can be applied in order to avoid additional costs down the road.
- Ediscovery capabilities – In order for a data archiving solution to serve as an ediscovery solution, it needs to offer tamper-proof storage and archive data in a WORM (write once, read many) format that fully preserves the evidentiary quality of the archived records. Robust search with standard operators like Boolean, wildcard, proximity and fuzzy, combined with a large number of search criteria, will ensure that you can pinpoint exact matches in terabytes of archived files.
- Access controls – Employee self-service is extremely important in archive data software, but regular employees need to have restricted access to the archive and be allowed to view, search and manage their mailbox only. In contrast, legal and compliance teams, IT personnel and administrators need to be given more control over the archiving system. This is accomplished by assigning different roles and permissions to different user groups. Some archiving solutions offer only pre-set user roles, while others allow a lot more flexibility in both the number of roles that can be created and the permissions that are associated with a specific role. It’s also crucial for the data archiving software to offer a full access log (activity trail) so that the responsible teams can control user activity on the platform and check if anyone tried to misuse the system.
- Integration of historical data – Most organizations looking for a data archiving solution already have volumes of historical data that need to be preserved but might be housed in another system or in disparate systems. Efficient data archiving software will allow you to migrate your existing data from a legacy or competitor system without major issues while preserving data integrity.
- Technical support – Technology never gets the job done alone. You need a reputable archiving vendor whose technical expertise is not the only thing they can offer. When you need to retrieve data fast and something goes wrong, you need a trusted partner whose technical support team can assist you 24/7. In addition to that, opting for a vendor that has a dedicated, in-house tech support team means that you’ll be spending less time on onboarding and training and zero time maintaining and managing the software.
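As an illustration of the access controls point above, the following Python sketch pairs role-based permissions with an activity trail. The role names, permission names and log fields are all hypothetical; actual archiving platforms define their own.

```python
from datetime import datetime, timezone

# Hypothetical roles and the actions each one may perform
ROLE_PERMISSIONS = {
    "employee": {"search_own"},
    "compliance": {"search_own", "search_all", "export"},
    "admin": {"search_own", "search_all", "export", "manage_policies"},
}

# Activity trail: every access attempt is recorded, allowed or not
audit_log = []


def authorize(user, role, action):
    """Check whether the role permits the action and log the attempt."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "action": action,
        "allowed": allowed,
    })
    return allowed


print(authorize("ana", "employee", "search_own"))
print(authorize("ana", "employee", "export"))
print(len(audit_log), "attempts logged")
```

Logging denied attempts as well as successful ones is what lets the responsible teams spot attempted misuse later.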
|Jatheon Cloud is a secure, AWS-based data archiving platform which allows organizations to capture communications data like email, Facebook, Twitter, WhatsApp, mobile calls and text messages with all metadata, including edits and deletions. It also lets relevant departments (legal, IT, compliance) access, search, manage and produce these records for multiple use cases – as a simple reminder, HR inquiries, legal investigations, productivity tracking, and as part of various compliance obligations. If you’d like to learn more about data archiving with Jatheon or test the Jatheon Cloud platform, contact us or schedule a no-commitment, 20-minute product demo.|
What is the difference between data storage and data archiving?
The main difference between data storage and archiving is how they handle data. Data storage focuses on active data being used regularly for various operations while data archiving focuses on long-term retention of data that isn’t actively being used, but needs to be kept for compliance, legal, or other purposes. Archived data is also usually transported from the main data storage unit to a separate archive to make active data easier to manage.
Why is archiving better than deleting?
Unlike deleting, which gets rid of data permanently, archiving lets you keep a record of all past data and search it for the specific information you need. This helps ensure compliance with federal and state regulations without cluttering your active storage systems, improving your resource management efficiency.
What files should be archived?
Archive any files of critical business importance, such as legal contracts, financial statements, and email, text and social media communications, along with any data subject to regulatory compliance requirements.
Who is responsible for archiving?
Archiving responsibility in most cases falls on the organization that generates the data itself. This includes businesses, government agencies, and institutions. These organizations usually have a dedicated person or team responsible for archiving and keeping their data safe with the use of their internal database or an external archiving system.