Wandering Key Technology
Introduction
Sensitive digital data, whether stored or transferred, requires protection, and the most common approach is to encrypt it with the strongest solution available, which should keep the data safe and secure for years to come. While modern encryption algorithms offer good data protection, they share a common weakness: the encryption keys and initialization vectors themselves, which must also be protected to avoid security leaks. Protecting encryption keys can lead to complex and expensive solutions, as the architecture of an information system then has to rely on additional software and hardware components for key management. Usually the actual encryption keys are themselves stored encrypted on separate servers with restricted access. Such a design increases the complexity of the data model and, at the same time, reduces the reliability and robustness of the system, since every additional component is one more thing that can fail.
Today many companies on the market offer key management solutions, but they all come down to the same standard approaches: either centralized key management, the most common solution, with a key safe or vault dedicated to protecting the encryption keys, or a hierarchical (onion) architecture with a rather complex procedure for accessing the keys.
In addition, both of the above approaches share the same weaknesses: the number of unique keys per object is limited, and those keys must physically exist and therefore require protection.
To break this chain of key protection steps and concentrate only on protecting the actual data, the obvious proposition is to abandon storing encryption keys and initialization vectors altogether and to generate them only when necessary. Furthermore, the number of unique keys generated per entity or user would be driven by the particular system requirements and could reach tens of millions or even billions of keys and IVs.
With this article we would like to introduce the Wandering Key technology, already developed and ready to use, which offers an elegant and efficient solution to this puzzle. Depending on the requirements, the automatic key generation can be packaged as a module attached to a database server, a standalone application, a FUSE mount, an embedded solution, etc.
The Wandering Key technology removes the need to store, and therefore to protect, the actual symmetric encryption keys and IVs, while keeping the data safe and secure with existing encryption algorithms. Millions of unique keys associated with the object that initiated encryption or decryption are generated on the fly and are never stored anywhere. The technology can be applied to any field that requires protecting sensitive data, such as data warehousing, file systems, communications, etc.
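The actual key generation algorithm behind the Wandering Key technology is not disclosed in this article, so the following C sketch is only an illustration of the general principle: a key and IV are re-derived on demand from a per-cluster secret, an entity identifier and a key index, instead of being read from storage. The `derive_key_iv` function, the HMAC-SHA256 construction and all parameter names are assumptions made for this sketch; mbedTLS is used here only because the demonstration module described later is also built on it.

```c
/* Illustrative sketch only: the real Wandering Key algorithm is not disclosed.
 * Shown here is merely the idea of regenerating a key/IV on demand from
 * (cluster secret, entity id, key index) rather than storing them. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <mbedtls/md.h>   /* one-shot HMAC from mbedTLS */

/* Hypothetical derivation: HMAC-SHA256 over (entity id, key index), keyed by a
 * per-cluster secret and split into 16 bytes of AES-128 key and 16 bytes of IV. */
static int derive_key_iv(const unsigned char *cluster_secret, size_t secret_len,
                         uint64_t entity_id, uint32_t key_index,
                         unsigned char key[16], unsigned char iv[16])
{
    unsigned char msg[12];
    unsigned char out[32];                 /* SHA-256 output size */

    memcpy(msg, &entity_id, sizeof entity_id);
    memcpy(msg + 8, &key_index, sizeof key_index);

    int rc = mbedtls_md_hmac(mbedtls_md_info_from_type(MBEDTLS_MD_SHA256),
                             cluster_secret, secret_len, msg, sizeof msg, out);
    if (rc != 0)
        return rc;

    memcpy(key, out, 16);                  /* never written to disk */
    memcpy(iv, out + 16, 16);
    return 0;
}

int main(void)
{
    const unsigned char secret[] = "per-cluster secret (hypothetical)";
    unsigned char key[16], iv[16];

    /* Re-deriving with the same (entity, index) yields the same key and IV,
     * so neither has to be stored between encryption and decryption. */
    if (derive_key_iv(secret, sizeof secret - 1, 42, 1234567u, key, iv) == 0)
        printf("key[0]=%02x iv[0]=%02x\n", key[0], iv[0]);
    return 0;
}
```

The point of the sketch is only that nothing key-related needs to persist: as long as the deriving side and the decrypting side agree on the entity and the key index, the key can be recreated at any time.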
For encrypting communications, two approaches are most commonly used: TLS and E2EE. TLS relies on an intermediary, usually a server through which the communication passes; typically the protocol involves certification of the parties and the use of asymmetric keys at the start of the session. E2EE is the direct exchange of encrypted data between the parties; examples include digital radio or the exchange of information between a drone and its control station. Communication with a drone is chosen here as a characteristic example, but it could be any communication that requires encryption. Common to both approaches is the fact that the set of keys is limited, typically one object, one key for the entire communication session.
With the Wandering Key technology these limitations can be lifted while keeping the communication channels safe and secure, since a different encryption key can be applied to each packet sent.
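To make that idea concrete, the sketch below seals a single packet under a freshly chosen key index and ships only the (entity id, key index) pair in the header, so the receiver can re-derive the same key; the key itself never travels. The header layout, the field names and the AES-128-CTR choice are assumptions for this sketch (the article does not specify a wire format), and the toy derivation is the same one used in the sketch above.

```c
/* Hedged illustration: applying a different key to every packet.  The header
 * layout and the derivation below are assumptions for this sketch only. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <mbedtls/aes.h>
#include <mbedtls/md.h>

struct packet_hdr {
    uint64_t entity_id;    /* whose key set is used                  */
    uint32_t key_index;    /* which of its virtual keys was selected */
};

/* Same toy derivation as in the previous sketch. */
static void derive(const unsigned char *secret, size_t slen,
                   uint64_t entity, uint32_t index,
                   unsigned char key[16], unsigned char nonce[16])
{
    unsigned char msg[12], out[32];
    memcpy(msg, &entity, 8);
    memcpy(msg + 8, &index, 4);
    mbedtls_md_hmac(mbedtls_md_info_from_type(MBEDTLS_MD_SHA256),
                    secret, slen, msg, sizeof msg, out);
    memcpy(key, out, 16);
    memcpy(nonce, out + 16, 16);
}

/* Encrypt one payload with AES-128-CTR under a freshly selected key index. */
static void seal_packet(const unsigned char *secret, size_t slen,
                        uint64_t entity, uint32_t index,
                        const unsigned char *payload, size_t len,
                        struct packet_hdr *hdr, unsigned char *ciphertext)
{
    unsigned char key[16], nonce[16], stream[16];
    size_t off = 0;
    mbedtls_aes_context aes;

    derive(secret, slen, entity, index, key, nonce);
    mbedtls_aes_init(&aes);
    mbedtls_aes_setkey_enc(&aes, key, 128);
    mbedtls_aes_crypt_ctr(&aes, len, &off, nonce, stream, payload, ciphertext);
    mbedtls_aes_free(&aes);

    hdr->entity_id = entity;   /* the receiver re-derives the key from these */
    hdr->key_index = index;    /* two fields; the key itself is never sent   */
}

int main(void)
{
    const unsigned char secret[] = "shared cluster secret (hypothetical)";
    const unsigned char payload[] = "telemetry frame";
    unsigned char ct[sizeof payload];
    struct packet_hdr hdr;

    /* In a real link a new index would be drawn for every outgoing packet. */
    seal_packet(secret, sizeof secret - 1, 7, 1048576u, payload,
                sizeof payload, &hdr, ct);
    printf("packet sealed with key index %u\n", (unsigned)hdr.key_index);
    return 0;
}
```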
Wandering Key Technology
The technology relies on clusters and entities to manage virtual keys. A cluster can be a database cluster, an organization, a particular group of people, etc. An entity is an object within the cluster, such as a database user, a colleague within the organization, a piece of electronic equipment within a unit, a document folder, etc.
Within a cluster, each entity maintains a set of unique virtual keys that is not shared with other entities. Clusters are separate, independent units that determine all possible keys that might be generated for any given entity. By design, the keys of different clusters do not intersect.
In general, the number of keys generated for each entity is determined by the requirements of the task. In the database engine example provided here, the module generates approximately 30M (33,554,431, i.e. 2^25 - 1) unique keys for each entity, regardless of how many entities are present within the cluster.
To borrow an astronomical analogy, a cluster can be imagined as an independent galaxy with a myriad of stars, where each star has its own set of planets unique to that galaxy. In this analogy the star is an entity and the planet is the encryption key.
Needless to say, key generation in the Wandering Key technology is fundamentally different from the generation of session keys: every entity has at its disposal its own set of associated virtual keys, which are generated on the fly as needed.
This picture shows part of a database server architecture that includes the key generation module. The module itself is an extension of the database engine and an essential part of the server whenever data is encrypted or decrypted. This architecture is much simpler than the centralized or hierarchical approaches to storing keys, as it is no different from a typical server architecture. To reiterate, the advantage of such an architecture is that the encryption keys are not part of the data and are not stored anywhere, and therefore cannot be compromised.
When working with databases, this approach not only avoids the problem of key leakage but also reduces overall system administration and maintenance costs by eliminating equipment dedicated to key vaults and the like. Furthermore, it relaxes the requirements for backup data storage, as backups can be kept in public cloud services without resorting to expensive secure storage services.
Real World Testing
As mentioned above, the number of keys generated for each entity is limited; in the current configuration of the database engine it is approximately 30M unique keys. In real-world applications the keys are not chosen sequentially, as it is hard to imagine such a large number of records being encrypted at once for a particular user or entity. The keys are therefore chosen from the set at random, and some effects caused by the random generator were expected.
To illustrate the potential problem, imagine a set of only 10 keys from which 5 are to be chosen for encryption. In theory, a pseudo-random generator might yield the sequence 1, 11, 21, 31 and 41, each of which would map onto key #1 (for example, by reduction modulo the size of the set), so key #1 would be used every time. In our case the set of keys is quite large, but the numbers produced by the random generator may still exceed the number of keys in the set, so some keys may be repeated. That is why the technology is called the Wandering Key: once in a while the same key is expected to be reused, but its usage is unpredictable; in other words, the key wanders.
The testing was carried out on Ubuntu Linux 22.04 LTS. The database server used for data processing was PostgreSQL 14.5, functionally extended with the vlxaux.so key generation module in order to test the technology in a multi-user environment.
Key Wandering Test
Around 20 tests were performed, each generating 10,000,000 keys for the same entity. The goal was to find out how many repetitions of the same key might occur. All tests yielded similar results; the first 8 were chosen as examples for this article.
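For reference, a repetition tally like the one in the tables below can be produced with a short counting program. The sketch assumes the selected key indices were logged to a text file, one index per line, with indices ranging from 0 up to the size of the set; this is an assumption about methodology, not the harness actually used for the published tests.

```c
/* Sketch: count how many keys were seen exactly 1, 2, 3, ... times, given a
 * log of selected key indices (one per line) on stdin.  The logging format
 * and index range are assumptions; this is not the original test harness. */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define NKEYS 33554431UL   /* size of one entity's key set, per the article */

int main(void)
{
    /* One byte counter per possible key: about 32 MB for a 33.5M-key set. */
    uint8_t *count = calloc(NKEYS, sizeof *count);
    if (count == NULL) {
        perror("calloc");
        return 1;
    }

    unsigned long idx;
    while (scanf("%lu", &idx) == 1)        /* logged key indices from stdin */
        if (idx < NKEYS && count[idx] < 255)
            count[idx]++;

    /* Build the repetition histogram: how many keys were seen r times. */
    unsigned long long seen[8] = {0};
    for (unsigned long i = 0; i < NKEYS; i++)
        if (count[i] > 0 && count[i] < 8)
            seen[count[i]]++;

    for (int r = 1; r < 8; r++)
        printf("keys seen %d time(s): %llu\n", r, seen[r]);

    free(count);
    return 0;
}
```

A run such as `./tally < test1_indices.txt` would then print one line per repetition count, matching the layout of the detailed table below.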
The results are presented in the histograms shown here; the second histogram depicts the same results on a logarithmic scale. The following tables present the results in more detail.
What preliminary conclusions can be drawn? A small number of keys were repeated 3 times each, and no key was seen more than 3 times per 10,000,000. So while some keys do repeat, the share of repetitions is minuscule: keys seen twice account for under 0.1% of the total, and keys seen three times for no more than 0.0001%. The number of repetitions could be reduced further by enlarging the set of generated keys, which is easy to do if resources permit. Whether that would be practical is another matter: generating about 30M keys per entity requires 16 KB of static memory, and increasing each set to, say, a billion keys would require roughly 500 KB more, which might not be feasible on busy servers.
The table below shows, for each test, the number of distinct keys obtained out of the 10,000,000 requested; from the difference it can be seen that roughly 9,500 keys per test were seen more than once.
Test 1 | Test 2 | Test 3 | Test 4 | Test 5 | Test 6 | Test 7 | Test 8 |
---|---|---|---|---|---|---|---|
9990577 | 9990530 | 9990518 | 9990403 | 9990388 | 9990721 | 9990605 | 9990691 |
The following table shows the actual data yielded by each test. The leftmost column is the number of repetitions of a key; the numbers in the test columns are the quantities of keys seen that many times. A count of 1 means those keys were never repeated.
Repetitions | Test 1 | Test 2 | Test 3 | Test 4 | Test 5 | Test 6 | Test 7 | Test 8 |
---|---|---|---|---|---|---|---|---|
1 | 9981160 | 9981069 | 9981043 | 9980812 | 9980783 | 9981451 | 9981215 | 9981392 |
2 | 9411 | 9452 | 9468 | 9585 | 9598 | 9261 | 9385 | 9289 |
3 | 6 | 9 | 7 | 6 | 7 | 9 | 5 | 10 |
4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
According to the last test (Test 8), 10 keys out of 10,000,000 were repeated 3 times each. The table below shows the wandering of those keys; the numbers are the positions in the sequence at which each key was seen.
Key 1 | Key 2 | Key 3 | Key 4 | Key 5 | Key 6 | Key 7 | Key 8 | Key 9 | Key 10 |
---|---|---|---|---|---|---|---|---|---|
411354 | 550613 | 4686641 | 5718739 | 3300740 | 2227363 | 5808486 | 498976 | 1827439 | 1923135 |
3467466 | 1534378 | 5340168 | 6827904 | 3622090 | 9698150 | 7940642 | 3387508 | 2944199 | 3222319 |
8318601 | 3658000 | 5641261 | 7594123 | 3783247 | 9946324 | 9770118 | 7363288 | 3363373 | 9346091 |
Since the tests were intentionally run for one and the same entity, it should be noted that some keys that appeared only once in one test also appeared, again only once, in another. In other words, the distribution over, say, 100,000,000 generated keys would be the same as over 10,000,000, only with more repetitions. In fact, the combined set of 80,000,000 generated keys yielded only one key repeated 5 times.
Test for Uniqueness of Generated Key Sets for Different Entities
For this test, 50 entities were chosen at random and 10,000,000 keys were generated for each. The test passed with flying colors: not a single key was found to be common to any two entities in the group. Although this result was expected by design, the rather laborious work was completed in full, with all 1225 pairwise comparisons (50·49/2) between the 50 tables carried out. More intensive tests do not seem necessary at this point, since by design keys cannot be shared between entities.
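One of those 1225 pairwise checks could look like the sketch below, which assumes each entity's generated keys have been exported to a text file, one key per line, and sorted (for example with sort(1)); it then walks the two sorted files looking for a common line. This illustrates the comparison being made, not the harness used for the published test.

```c
/* Sketch of one pairwise uniqueness check between two entities, assuming each
 * entity's keys were exported one per line and the files were pre-sorted.
 * Not the harness used for the published test. */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "usage: %s sorted_keys_A.txt sorted_keys_B.txt\n", argv[0]);
        return 2;
    }
    FILE *a = fopen(argv[1], "r");
    FILE *b = fopen(argv[2], "r");
    if (!a || !b) {
        perror("fopen");
        return 2;
    }

    char la[256], lb[256];
    int have_a = (fgets(la, sizeof la, a) != NULL);
    int have_b = (fgets(lb, sizeof lb, b) != NULL);
    unsigned long common = 0;

    /* Classic sorted-merge walk: advance the smaller line, count equal ones. */
    while (have_a && have_b) {
        int cmp = strcmp(la, lb);
        if (cmp < 0)
            have_a = (fgets(la, sizeof la, a) != NULL);
        else if (cmp > 0)
            have_b = (fgets(lb, sizeof lb, b) != NULL);
        else {
            common++;
            have_a = (fgets(la, sizeof la, a) != NULL);
            have_b = (fgets(lb, sizeof lb, b) != NULL);
        }
    }

    printf("keys common to both entities: %lu\n", common);
    fclose(a);
    fclose(b);
    return common ? 1 : 0;
}
```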
Performance of Key Generation Module
All tests, including this one, were performed on a machine with an Intel i5-4690K CPU @ 3.50GHz. The average generation speed was about 800,000 keys per second, from which it can be concluded that the module does not impede normal server operations.
Summary
Although the Wandering Key technology was initially developed to eliminate the storage of encryption keys within databases, its possible applications go far beyond databases. For normal use within a database cluster there is no practical need to have tens or hundreds of millions of keys on hand for each entity. Such a need may arise for communications, for encrypting large file systems, or for systems where the number of entities is limited. In those scenarios the number of keys per entity could be increased to billions, so that each packet sent is encrypted with a different key. For file systems, the key could be changed for every encrypted page, removing the need to break files apart and scatter the parts across the network, as is done in today's practice.
Demonstration
The vlxaux.so extension for the sqlite3 database engine was developed for demonstration purposes and is available for download as a ready-to-use package. The extension works only on Linux and on x86_64 machines. It uses the AES, CAMELLIA and ARIA symmetric block ciphers in CBC and CTR modes to encrypt and decrypt data, and it is based on the mbedTLS encryption library. The module also incorporates other useful functionality, such as various compression algorithms for more efficient storage of large data blocks.
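For reference, this is how such an extension is typically loaded from C using the standard sqlite3 API, assuming the default extension entry point. The SQL functions that vlxaux.so itself exposes are not listed in this article, so none are called in this sketch.

```c
/* Minimal sketch of loading the demonstration extension from C.  Only the
 * standard sqlite3 loading API is used; the functions that vlxaux.so exposes
 * are not documented here, so none are invoked. */
#include <stdio.h>
#include <sqlite3.h>

int main(void)
{
    sqlite3 *db;
    char *err = NULL;

    if (sqlite3_open("demo.db", &db) != SQLITE_OK) {
        fprintf(stderr, "could not open database\n");
        sqlite3_close(db);
        return 1;
    }

    /* Extension loading is disabled by default and must be switched on. */
    sqlite3_enable_load_extension(db, 1);

    if (sqlite3_load_extension(db, "./vlxaux.so", NULL, &err) != SQLITE_OK) {
        fprintf(stderr, "load failed: %s\n", err ? err : "unknown error");
        sqlite3_free(err);
    } else {
        printf("vlxaux.so loaded; its SQL functions are now available\n");
    }

    sqlite3_close(db);
    return 0;
}
```

Built with `cc demo.c -lsqlite3`, the same loading can also be done interactively in the sqlite3 shell with the `.load ./vlxaux.so` command.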
The efile utility is another tool built on the Wandering Key technology and included for demonstration purposes; it can be used as a general-purpose utility to encrypt and decrypt files. It is available for download as a package from here. It, too, works only on Linux and on x86_64 machines.