Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Autoload of striped images #95

Open
wants to merge 12 commits into
base: master
Choose a base branch
from
Open

Conversation

blschatz
Copy link
Contributor

@blschatz blschatz commented Jun 7, 2019

This patch fixes building on current MacOS and adds support for automatically loading dependent striped image containers that are stored in the same folder as a container that has been opened.

@CLAassistant
Copy link

CLAassistant commented Jun 7, 2019

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.

✅ scudette
✅ blschatz
❌ Bradley


Bradley seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@scudette
Copy link
Collaborator

Still waiting on CLA for this. Please also check indentation - seems to be wrong in places.

@jtsylve
Copy link
Contributor

jtsylve commented Jun 12, 2019

I could be reading it wrong, but does this patch auto load any and all aff4 images in the same directory as the one the user is specifying? If so, this patch will break assumptions made by the c bindings. Currently in the C bindings the first AFF4 standard image is opened. If a user has multiple (unrelated) AFF4 images in the same directory, this patch will likely break that assumption in some cases.

I suggest we agree on a naming standard for split/striped images and limit the auto loading to images that fit that standard.

In the past we've discussed image.aff4 as the base image that users would select for loading and image.aff4.%u where %u is a segment number.

Thoughts?

@scudette
Copy link
Collaborator

This seems to be a convenience patch at best - it just saves a user from adding a * after the file name - i.e. aff4imager -e foobar image.aff4.*.

Do we really care that much? It would not be a good idea if it went off and automatically loaded all the images in the same directory for no reason.

@blschatz
Copy link
Contributor Author

The purpose of the patch is to enable any consumer of the c-aff4 library to open the Striped Canonical Reference Images.

The patch doesn't load all images in the folder. This resolution mechanism is only executed when an open Map contains references to unresolved image Streams, so is not even executed for single container images; for striped images it is very lightweight. What it does is quickly build a map of VolumeID to Container files by scanning the first 1k or so of each AFF4 file in the same folder.

This is the easiest way to ensure interoperability with the Standard. Without the patch, consumers need specify all additional containers required to materialise the Map.

Joe, I agree that creating a standardised naming scheme for striped/multi part images is a good idea moving forward - more from a UX perspective than from a technical one.

@blschatz
Copy link
Contributor Author

Michael,
The spacing related commits make the relevant lines consistent with the wider indent style. The majority of the surrounding code uses spaces, whereas those few lines used tabs.

@jtsylve
Copy link
Contributor

jtsylve commented Jun 13, 2019

This project benefit from the use of a .clang-format file or similar to deal with the formatting issues consistently.

@@ -248,8 +248,7 @@ AFF4Status BasicImager::handle_aff4_volumes() {
AFF4Flusher<AFF4Stream>(backing_stream.release()),
volume));

volume_objs.AddVolume(
std::move(AFF4Flusher<AFF4Volume>(volume.release())));
volume_objs.AddVolume(std::move(AFF4Flusher<AFF4Volume>(volume.release())));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the std::move from this call, it's not needed and will stop automatic copy elision

@@ -46,7 +46,7 @@ class FileBackedObject: public AFF4Stream {
// The filename for this object.
std::string filename;

explicit FileBackedObject(DataStore* resolver): AFF4Stream(resolver), fd(0){}
explicit FileBackedObject(DataStore* resolver, std::string filename): AFF4Stream(resolver), filename(filename), fd(0){}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is unnecessary as we have a factory function below. You should never need to create an instance of this type without a unique ptr to manage it.

#include "spdlog/spdlog.h"

namespace aff4 {

#define UNUSED(x) (void)x

std::string aff4_sprintf(std::string fmt, ...);
std::string aff4_sprintf(std::string fmt, ...);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please indent this and below back. A namespace definition does not increase indentation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want me to de-indent all of your code too? All the existing code is at a 4 space indent.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://google.github.io/styleguide/cppguide.html#Namespace_Formatting

The contents of namespaces are not indented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake. Apologies for the unwarranted whitespace change.

stream_urn, type_str);
return STATUS_OK;
}
AFF4Status VolumeGroup::GetStream(URN stream_urn, AFF4Flusher<AFF4Stream> &result) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please indent back


namespace aff4 {

void VolumeGroup::AddVolume(AFF4Flusher<AFF4Volume> &&volume) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please indent back here and below.

ScanForAFF4Volumes(path);
}
// Check known volume URNs.
if (foundVolumes.find(urn) != foundVolumes.end()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code looks the same as above - refactor?

@@ -327,9 +327,8 @@ FileBackedObject::~FileBackedObject() {
std::string mode,
AFF4Flusher<FileBackedObject> &result
) {
auto new_object = AFF4Flusher<FileBackedObject>(new FileBackedObject(resolver));
auto new_object = AFF4Flusher<FileBackedObject>(new FileBackedObject(resolver, filename));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use the factory function


if((volumes->GetStream(URN(line), target)) != STATUS_OK){
// Search in the repository for this target.
resolver->logger->info("Attempting to locate resource {} ", line);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually do not think this is a good architectural choice - we are breaking down the encapsulation between classes and assuming a particular context. This makes the code more fragile.

I think it is a mistake to lump volume discovery together with the volume group - volume groups are there to manager lifetimes and they currently do not assume a particular volume implementation (like zip files stored in file backed objects). It is the imager which is aware of the details and it should do the right thing.

IMHO the way this should work is - you try to open the map at which point the map object checks that all the named targets are valid. If any are not known it should return an error status to that effect. The imager should then handle the error if it wants and it makes sense by scanning the volumes near the one we started with. Remember that sometimes the volume may not even be stored on a filesystem and it does not make any sense to try to open it here.

@@ -153,6 +153,7 @@ AFF4_Handle* AFF4_open(const char* filename, AFF4_Message** msg) {
};

h->volumes.AddVolume(std::move(zip));
h->volumes.AddSearchPath(filename);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

search path is not related to the volume group (volumes have no concept of paths). This should be a separate helper object or function.

std::set<std::string> searchPaths;
std::unordered_map<URN, std::string> foundVolumes;

bool FoundVolumesContains(const std::string& filename);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please separate these into another class and use composition.

@@ -1035,6 +1035,160 @@ AFF4Status ZipFile::StreamAddMember(URN member_urn, AFF4Stream& stream,
}

return STATUS_OK;
}

std::string ZipFile::GetResourceID(std::string& filename, std::shared_ptr<spdlog::logger> logger) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this method use the instance at all? Is it just basically a static method?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants