path: root/TODO.md
Diffstat (limited to 'TODO.md')
-rw-r--r-- TODO.md | 19
1 files changed, 17 insertions, 2 deletions
diff --git a/TODO.md b/TODO.md
index 23454f7..496bf10 100644
--- a/TODO.md
+++ b/TODO.md
@@ -1,9 +1,24 @@
priority 1
----------
-- delta encode chunks
-- match stream against chunks from itself
+- add deltaEncode chunks function
+ - do not merge consecutive smaller chunks, as these could be stored as regular chunks if no similar chunk is found. A stored chunk therefore needs to be of `chunkSize` or less; otherwise it could not possibly be used for deduplication.
+ ```
+ for each new chunk:
+ find similar in sketchMap
+ if exists:
+ delta encode
+ else:
+ calculate fingerprint
+ store in fingerprintMap
+ store in sketchMap
+ ```
- read from repo
+ - store recipe
+ - load recipe
+ - read chunks in-order into a stream
+- properly store information to be DNA encoded
priority 2
----------
- make more use of the `Reader` API (which is analogous to the `IOStream` in Java)
+- refactor matchStream as right now it is quite
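
The pseudocode added under priority 1 (look up a similar chunk in `sketchMap`, delta encode on a hit, otherwise fingerprint and store) could be sketched in Go roughly as follows. This is only an illustration under stated assumptions: the `Store` type, `ChunkID`, `Add`, and `toySketch` are hypothetical names that do not come from the repository, Go is assumed as the project language, and `toySketch` is a deliberately naive stand-in (it hashes only the first 8 bytes) rather than a real resemblance sketch. The delta encoding itself is left out; `Add` only reports which base chunk would be used.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// ChunkID is a hypothetical identifier for a stored chunk.
type ChunkID int

// Store holds the two maps named in the TODO pseudocode.
type Store struct {
	fingerprints map[uint64]ChunkID // exact-match index (fingerprintMap)
	sketches     map[uint64]ChunkID // similarity index (sketchMap)
}

// NewStore returns an empty Store.
func NewStore() *Store {
	return &Store{
		fingerprints: make(map[uint64]ChunkID),
		sketches:     make(map[uint64]ChunkID),
	}
}

// fingerprint hashes the whole chunk with FNV-1a.
func fingerprint(data []byte) uint64 {
	h := fnv.New64a()
	h.Write(data)
	return h.Sum64()
}

// toySketch is a placeholder, NOT a real resemblance sketch:
// it hashes only the first 8 bytes, so chunks sharing that
// prefix are treated as "similar".
func toySketch(data []byte) uint64 {
	n := len(data)
	if n > 8 {
		n = 8
	}
	h := fnv.New64a()
	h.Write(data[:n])
	return h.Sum64()
}

// Add follows the pseudocode: if a similar chunk is already in the
// sketch map, it returns that chunk's ID and true (the caller would
// delta encode against it). Otherwise it records the fingerprint and
// the sketch for the new chunk and returns the new ID and false.
func (s *Store) Add(id ChunkID, data []byte) (ChunkID, bool) {
	sk := toySketch(data)
	if base, ok := s.sketches[sk]; ok {
		return base, true
	}
	s.fingerprints[fingerprint(data)] = id
	s.sketches[sk] = id
	return id, false
}

func main() {
	s := NewStore()
	chunks := [][]byte{
		[]byte("aaaaaaaa-one"),
		[]byte("bbbbbbbb-two"),
		[]byte("aaaaaaaa-three"),
	}
	for i, c := range chunks {
		base, isDelta := s.Add(ChunkID(i), c)
		if isDelta {
			fmt.Printf("chunk %d: delta encode against chunk %d\n", i, base)
		} else {
			fmt.Printf("chunk %d: stored as a new chunk\n", i)
		}
	}
}
```

In this toy run the third chunk shares its first 8 bytes with the first one, so it is reported as a delta candidate against chunk 0, while the first two are stored and indexed in both maps.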