author n-peugnet <n.peugnet@free.fr> 2021-09-01 19:07:35 +0200
committer n-peugnet <n.peugnet@free.fr> 2021-09-01 19:07:35 +0200
commit db40818ef79ccb3f5f9232623f57ad284a4af7d0 (patch)
tree 6b1b1a7b6169eb19f6ca17ff87f1075b41bd513f /TODO.md
parent 34a84a44b4dfa513d8ceb1cfeec50ac78fb311e0 (diff)
move some consts into repo
Diffstat (limited to 'TODO.md')
-rw-r--r-- TODO.md 19
1 files changed, 17 insertions, 2 deletions
diff --git a/TODO.md b/TODO.md
index 23454f7..496bf10 100644
--- a/TODO.md
+++ b/TODO.md
@@ -1,9 +1,24 @@
priority 1
----------
-- delta encode chunks
-- match stream against chunks from itself
+- add deltaEncode chunks function
+ - do not merge consecutive smaller chunks, as these could be stored as chunks if no similar chunk is found. Thus each one needs to be of `chunkSize` or less, otherwise it could not possibly be used for deduplication.
+ ```
+ for each new chunk:
+     find similar in sketchMap
+     if exists:
+         delta encode
+     else:
+         calculate fingerprint
+         store in fingerprintMap
+         store in sketchMap
+ ```
- read from repo
+ - store recipe
+ - load recipe
+ - read chunks in-order into a stream
+- properly store information to be DNA encoded
priority 2
----------
- make more use of the `Reader` API (which is analogous to the `IOStream` in Java)
+- refactor matchStream as right now it is quite
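
The per-chunk loop added under priority 1 could be sketched in Go roughly as follows. This is an illustration under stated assumptions, not code from the repository: `Repo`, `Entry`, `NewRepo`, `Add`, `Restore`, `sketch`, `hash64` and `sketchLen` are hypothetical names, the sketch is a toy FNV hash of the first few bytes of the chunk, and the "delta" is a trivial shared-prefix encoding standing in for a real delta encoder. It also assumes that only chunks stored in full are added to `fingerprintMap` and `sketchMap`.

```go
package main

import (
	"bytes"
	"fmt"
	"hash/fnv"
)

const (
	chunkSize = 8 // toy value; real chunks would be far larger
	sketchLen = 4 // hypothetical: leading bytes that feed the toy sketch
)

// Repo holds the stored chunks and the two indexes named in the TODO.
type Repo struct {
	chunks         [][]byte
	fingerprintMap map[uint64]int // fingerprint -> chunk id
	sketchMap      map[uint64]int // sketch -> chunk id
}

// Entry records how one incoming chunk was handled.
type Entry struct {
	Delta  bool   // true if encoded against an existing chunk
	ID     int    // id of the stored chunk, or of the base chunk when Delta
	Prefix int    // number of leading bytes shared with the base chunk
	Suffix []byte // bytes that follow the shared prefix
}

func NewRepo() *Repo {
	return &Repo{fingerprintMap: map[uint64]int{}, sketchMap: map[uint64]int{}}
}

// hash64 is a stand-in fingerprint (FNV-1a), not a cryptographic hash.
func hash64(b []byte) uint64 {
	h := fnv.New64a()
	h.Write(b) // Write on a hash.Hash never returns an error
	return h.Sum64()
}

// sketch is a toy resemblance hash: chunks sharing their first sketchLen
// bytes are treated as similar.
func sketch(c []byte) uint64 {
	n := sketchLen
	if len(c) < n {
		n = len(c)
	}
	return hash64(c[:n])
}

// Add follows the pseudocode: delta encode when a similar chunk exists,
// otherwise fingerprint the chunk and store it in both maps.
func (r *Repo) Add(c []byte) Entry {
	s := sketch(c)
	if id, ok := r.sketchMap[s]; ok {
		base := r.chunks[id]
		n := 0
		for n < len(c) && n < len(base) && c[n] == base[n] {
			n++
		}
		return Entry{Delta: true, ID: id, Prefix: n, Suffix: append([]byte(nil), c[n:]...)}
	}
	id := len(r.chunks)
	r.chunks = append(r.chunks, append([]byte(nil), c...))
	r.fingerprintMap[hash64(c)] = id
	r.sketchMap[s] = id
	return Entry{ID: id}
}

// Restore rebuilds the original chunk from an Entry.
func (r *Repo) Restore(e Entry) []byte {
	if !e.Delta {
		return r.chunks[e.ID]
	}
	out := append([]byte(nil), r.chunks[e.ID][:e.Prefix]...)
	return append(out, e.Suffix...)
}

func main() {
	input := []byte("abcdefghabcdXYZhzzzzzzzz")
	r := NewRepo()
	var out []byte
	for i := 0; i < len(input); i += chunkSize {
		end := i + chunkSize
		if end > len(input) {
			end = len(input)
		}
		out = append(out, r.Restore(r.Add(input[i:end]))...)
	}
	// Number of chunks stored in full, and whether the stream round-trips.
	fmt.Println(len(r.chunks), bytes.Equal(out, input))
}
```

Because every stored chunk is at most `chunkSize` long, any chunk that ends up stored in full stays usable as a deduplication target, which is the point the TODO makes about not merging consecutive smaller chunks.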