Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/TS expiration handling without network #2937

Merged
merged 7 commits into from
Sep 23, 2024

Conversation

carpawell
Copy link
Member

No description provided.

@carpawell carpawell self-assigned this Sep 12, 2024
@carpawell carpawell changed the title Fix/ts expiration handling without network Fix/TS expiration handling without network Sep 12, 2024
Copy link

codecov bot commented Sep 12, 2024

Codecov Report

Attention: Patch coverage is 58.51064% with 39 lines in your changes missing coverage. Please review.

Project coverage is 23.88%. Comparing base (4a1bc79) to head (3e8cc99).
Report is 25 commits behind head on master.

Files with missing lines Patch % Lines
pkg/local_object_storage/metabase/graveyard.go 55.88% 11 Missing and 4 partials ⚠️
pkg/local_object_storage/metabase/version.go 62.50% 6 Missing and 3 partials ⚠️
pkg/services/object/put/local.go 0.00% 4 Missing ⚠️
pkg/local_object_storage/shard/gc.go 57.14% 2 Missing and 1 partial ⚠️
cmd/neofs-lens/internal/meta/list-graveyard.go 0.00% 2 Missing ⚠️
cmd/neofs-node/object.go 0.00% 2 Missing ⚠️
pkg/local_object_storage/shard/control.go 50.00% 1 Missing and 1 partial ⚠️
cmd/neofs-node/storage.go 0.00% 1 Missing ⚠️
pkg/local_object_storage/metabase/iterators.go 0.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2937      +/-   ##
==========================================
- Coverage   23.90%   23.88%   -0.03%     
==========================================
  Files         776      776              
  Lines       45708    45893     +185     
==========================================
+ Hits        10928    10961      +33     
- Misses      33920    34070     +150     
- Partials      860      862       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@carpawell carpawell marked this pull request as ready for review September 12, 2024 18:44
CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
@carpawell carpawell force-pushed the fix/TS-expiration-handling-without-network branch from 8831c47 to 8314653 Compare September 12, 2024 21:33
It takes too much of network/disk to handle tombstones expiration and removal,
so it is better to store it when tombstone is being indexed in metabase.
Additional little-endian expiration suffix was added to graveyard values.
Metabase version was increased as it is a non-compatible change. Relates #2929.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
They do not differ from the other objects (e.g. locks do). Initial logic has
changed much, graveyard now allows to handle expired tombstones marks (do not
confuse it with the lists of regular indexes) independently, while disk can be
cleared with the other types of object. Also, add tests for it.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
@carpawell carpawell force-pushed the fix/TS-expiration-handling-without-network branch from 8314653 to 152150e Compare September 12, 2024 21:34
Copy link
Contributor

@cthulhu-rider cthulhu-rider left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i dont think it's good to separate commit dropping old implementation from the one changing the approach. Although red diff is pretty, transient state with dead code makes no sense

pkg/local_object_storage/metabase/graveyard.go Outdated Show resolved Hide resolved
pkg/local_object_storage/metabase/graveyard.go Outdated Show resolved Hide resolved
var counter int

err := db.boltDB.Update(func(tx *bbolt.Tx) error {
bkt := tx.Bucket(graveyardBucketName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if bkt == nil?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also not expected on an inited metabase:

string(graveyardBucketName): {},

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clear panic

Copy link
Member Author

@carpawell carpawell Sep 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this operation is a process of Inhume and none of the methods in inhume.go does this check and panics. this is important to me cause it is literally "put" and then "get" with checking if the result is not nil, not sure bbolt code should be like this; and if we want this to be paniced with a desc, we have to add a desc in many more places

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as i said in #2937 (comment), we need to dedicate a new issue for this then. Could u pls create it?

Copy link
Member Author

@carpawell carpawell Sep 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you describe the problem so I can create it? as i have already said, panic is not expected at all. data is versioned and/or inited explicitly, any changed database data should be considered as a strange and not-possible-to-handle situation. panic with trace or panic with trace and two more words almost don't differ to me, dev will have to understand the whole situation anyway, and user will understand nothing. if values from bbolt should be expected corrupted, we have to change many many places. considering all the said above, what do you want to be attached to a new issue?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can only happen with corrupted DB. Corrupted meta DB -> resync meta please. But making it a proper error in all cases won't be easy.

pkg/local_object_storage/metabase/graveyard.go Outdated Show resolved Hide resolved
pkg/local_object_storage/metabase/version.go Show resolved Hide resolved

for k, v := c.First(); k != nil; k, v = c.Next() {
newVal := make([]byte, addressKeySize, addressKeySize+8)
copy(newVal, v)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo we should assert len(v) == addressKeySize

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added but i dont know, it is kinda we need a check in every code that takes any value from db that was not created by this exact code (if fact any start except a start from scratch)
returning an error now, not sure what else can be done here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

returning an error now, not sure what else can be done here

some error is at least better than silence. When we expect strict value format, and it's unintentionally violated in fact, app should react on its own - remove the shard from the game or completely shutdown. Otherwise, further runtime may be unsafe in relation to stored data

pkg/local_object_storage/metabase/version.go Show resolved Hide resolved
Graveyard now has tombstone expiration marks in epochs, there is no need to use
any network requests, just drop records if an epoch is big enough.
Closes #2929.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
It may or may not index tombstone's expiration in graveyard.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
Migration can be done automatically, let admin's life be a better thing. TS
expiration is taken as the current epoch + 5. It is not critical if TS will live
more. In practice, side effects _may_ be seen only the first 5 hours after an
update, and only returned status code _may_ differ.

Signed-off-by: Pavel Karpy <carpawell@nspcc.ru>
@carpawell carpawell force-pushed the fix/TS-expiration-handling-without-network branch from 817823d to 3e8cc99 Compare September 23, 2024 11:29
@carpawell
Copy link
Member Author

transient state with dead code makes no sense

It separates functional changing and unused code deletion. Overall, I agree with you but in this PR IMO: it simplified review; it separated changing GC logic in storage and dropping the whole package from object_manager dir, so there were no clear "old" and "new" ones. @roman-khimov, interested in our opinion.

@roman-khimov
Copy link
Member

roman-khimov commented Sep 23, 2024

transient state with dead code makes no sense

It separates functional changing and unused code deletion

OK both ways to me (since the node works fine between commits anyway). Maybe with slight preference to deletion in the same commit as the code becomes redundant. If I'm to think about the way I'd commit it --- it'd be a single commit.

@roman-khimov roman-khimov merged commit a0447ce into master Sep 23, 2024
21 checks passed
@roman-khimov roman-khimov deleted the fix/TS-expiration-handling-without-network branch September 23, 2024 13:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants