Skip to content

Conversation

@Ethanlm
Copy link
Contributor

@Ethanlm Ethanlm commented Jul 10, 2017

This adds an interface and two implementations, one that will load from a local file, and another
that will load a config from artifactory.

It also modifies the ResourceAwareScheduler and Multitenant schedulers to have a plugin interface and configuration entries for the plugin.

Unit tests are also added for the implementations.

See: #1785

Also fixed some checkstyle issues.

@Ethanlm
Copy link
Contributor Author

Ethanlm commented Jul 17, 2017

Hi @HeartSaVioR @revans2, could you please review this when you get a chance? Thanks very much!

Copy link
Contributor

@HeartSaVioR HeartSaVioR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ethanlm Sorry to visit too late. This is a bit huge and I often lost my concentration.
Left some comments but looks good overall. The blocker thing is that I have no idea what artifactory is.
Is it internal proprietary? If then it should be better to generalize (or create another one) ArtifactoryConfigLoader to accept generic http server implementation. OK to not support artifact's' feature in http server implementation.

@revans2 What do you think?


@SuppressWarnings("rawtypes")
private Map _conf;
private int _artifactoryPollTimeSecs = 600;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: Maybe better to have a constant (private static final) for 600.

private boolean _cacheInitialized = false;
// Location of the file in the artifactory archive. Also used to name file in cache.
private String _localCacheDir;
private String _artifactoryScheme = "http";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: Maybe better to have a constant (private static final) for http.

// Location of the file in the artifactory archive. Also used to name file in cache.
private String _localCacheDir;
private String _artifactoryScheme = "http";
private String _baseDirectory = "/artifactory";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: Maybe better to have a constant (private static final) for /artifactory.

private String _artifactoryScheme = "http";
private String _baseDirectory = "/artifactory";
private int _lastReturnedTime = 0;
private int _timeoutSeconds = 10;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: Maybe better to have a constant (private static final) for 10.

HttpEntity entity = response.getEntity();
return entity != null ? EntityUtils.toString(entity) : null;
} else {
LOG.error("Got unexpected response code {}", status);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe better to include the content of entity to error log as well if the response has an entity even though it was not successful.

tmpDirPath = Files.createTempDirectory("TestArtifactoryConfigLoader");
File f = tmpDirPath.toFile();
f.mkdir();
File dir=new File(f, "nimbus");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both spaces around =

dir.mkdir();
}

private void recursiveDeleteFile(File dir) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could just delete it with FileUtils.deleteDirectory() if we're relying on Apache commons-io now. If not it's OK to keep this as it is.

}

@Test
public void testInvalidConfig() throws Exception {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like redundant with testInvalid().

fw.close();

Config config = new Config();
config.put(ArtifactoryConfigLoader.ARTIFACTORY_URI, "file://"+temp.getCanonicalPath());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both spaces around +


Assert.assertNotNull("Unexpectedly returned null", result2);

// Shouldn't change yet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove comment.

@Ethanlm
Copy link
Contributor Author

Ethanlm commented Aug 19, 2017

@HeartSaVioR Thanks for the review. I will address your comments ASAP but it will take me a little longer to catch up this PR

@Ethanlm
Copy link
Contributor Author

Ethanlm commented Aug 22, 2017

@HeartSaVioR Thanks for the comments! I rebased the branch and addressed some of your comments. I am still working on the rest of them

@Ethanlm
Copy link
Contributor Author

Ethanlm commented Aug 24, 2017

I refactored the code. It simplifies configurations a lot. getConfigLoader() will choose IConfigLoader implementation based on the scheme of scheduler.config.loader.uri. It doesn't require users to set many parameters.

FileConfigLoader will be used if file:// scheme is used; ArtifactoryHttpConfigLoader if artifactory+http://. So we can implement HttpConfigLoader for normal http scheme (TO DO).

Doing some more testing, especially for ArtifactoryHttpConfigLoader. HttpConfigLoader is to be implemented. Hope to get some suggestions first if any

@HeartSaVioR
Copy link
Contributor

HeartSaVioR commented Aug 25, 2017

@Ethanlm
I would like to be clear about Artifactory to continue reviewing. I have no idea what is this, but whether it is open specification or proprietary on specific company, if we can provide it to users, we don't need to consider creating HttpConfigLoader. And if we need to define some interfaces for HttpConfigLoader and have to provide the implementation, maybe we could just implement Artifactory and let users spin and use it.

Please mention with cc. to proper folk(s) when you think you couldn't answer this question.

@Ethanlm
Copy link
Contributor Author

Ethanlm commented Aug 25, 2017

@HeartSaVioR You may want to check with this website. My understanding is that we can get most recent file from artifactory server. It's not the same with normal http server.

@HeartSaVioR
Copy link
Contributor

@Ethanlm Thanks for the information. Looks like we can't expect users to use artifactory. If possible we'd be better to have HttpConfigLoader implementation, but may be not easy to implement same for artifactory like pulling most recent file. We can start with just getting full URL for the config file, and pull it periodically.

@Ethanlm
Copy link
Contributor Author

Ethanlm commented Aug 25, 2017

@revans2 Do you have any suggestions? Thanks.

Copy link
Contributor

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it looks really good. Just 2 comments about how I think we can make it more maintainable/extensible in the future.

String scheme = uri.getScheme();
switch (SchemeType.toSchemeType(scheme)) {
case FILE:
return new FileConfigLoader(loaderParams);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me it feels like this would be a great place to use a ServiceLoader instead of having a hard coded FileConfigLoader or ArtifactoryHttpConfigLoader.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been thinking about using a ServiceLoader and tried so. But I realized that we actually want to have only one type of loader working in the mean time (one loader for one type of scheme). So I think switch-case here seems cleaner.

The way I can think of applying ServiceLoader here is to have every type of loaders check the scheme in their load() function and then either directly return null (because they don't deal with this type of scheme) or return the result. Something like:

For HttpConfigLoader

Map load(loaderParams) {
    scheme = getScheme();
    if (!scheme.equalsIgnoreCase("http")) {
        return null;
    }
    //else: process and return result;
}

For ArtifactoryHttpConfigLoader,

Map load(loaderParams) {
    scheme = getScheme();
    if (!scheme.equalsIgnoreCase("artifactory+http")) {
        return null;
    }
    //else: process and return result;
}

Is this what you are talking about? Sorry if I mis-understood it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I was thinking we might want to use a factory pattern so we can cache connections and other things in the IConfigLoader implementation. Something perhaps like.

public interface IConfigLoaderFactory {
    IConfigLoader createIfSupported(URI uri, Map<String, Object> conf);

    private static ServiceLoader<IConfigLoaderFactory> serviceLoader = ServiceLoader.load(IConfigLoaderFactory.class);
    
    public static IConfigLoader open(URI uri, Map<String, Object> conf) {
        for (IConfigLoaderFactory factory: serviceLoader) {
            IConfigLoaderFactory ret = factory.createIfSupported(uri, conf);
            if (ret != null) {
                return ret;
            }
        }
        throw some exception();
    }
}

public class HttpArtifactoryServiceLoaderFactory implements IConfigLoaderFactory{
    public IConfigLoader createIfSupported(URI uri, Map<String, Object> conf);
        if (!"artifactory+http".equals(uri.getScheme())) {
            return null;
        }
        return new ...;
    }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a great idea to me. I will try to do so. Thanks

* A dynamic loader that can load scheduler configurations for user resource guarantees from Artifactory (an artifact repository manager).
*/
public class ArtifactoryHttpConfigLoader implements IConfigLoader {
protected static final String ARTIFACTORY_BASE_DIRECTORY = "artifactory.config.loader.base.directory";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have configs in here can we tag them with the config validation annotations?

@isSting

And also register it with the config service loader?

import org.apache.storm.validation.Validated;

...

ArtifactoryHttpConfigLoader implements IConfigLoader, Validated {

and then append org.apache.storm.scheduler.utils.ArtifactoryHttpConfigLoader to ./storm-server/src/main/resources/META-INF/services/org.apache.storm.validation.Validated


String SCHEDULER_CONFIG_LOADER_URI = "scheduler.config.loader.uri";
String SCHEDULER_CONFIG_LOADER_POLLTIME_SECS = "scheduler.config.loader.polltime.secs";
String SCHEDULER_CONFIG_LOADER_TIMEOUT_SECS = "scheduler.config.loader.timeout.secs";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too can we annotate these with config validation annotations, and make IConfigLoader part of the Config Validation service loader?

@HeartSaVioR
Copy link
Contributor

@Ethanlm
I guess you're handling some different issues concurrently. (at least this and locality aware shuffle grouping)
If you're not currently working on HttpConfigLoader, I'm OK to file a new issue to address HttpConfigLoader, and keep my +1.

@Ethanlm
Copy link
Contributor Author

Ethanlm commented Aug 28, 2017

@HeartSaVioR Yes, please. But this patch is not ready yet. I am going to do some tests manually on ArtifactoryHttpConfigLoader and also add some comments to the source code. Thanks.

(I tested FileConfigLoader manually)

@HeartSaVioR
Copy link
Contributor

@Ethanlm Sure, please leave a comment when you are done. I'll take a look at further changes too.

this.conf = conf;
schedulingPrioritystrategy = (ISchedulingPriorityStrategy) ReflectionUtils.newInstance(
(String) conf.get(DaemonConfig.RESOURCE_AWARE_SCHEDULER_PRIORITY_STRATEGY));
(String) conf.get(DaemonConfig.RESOURCE_AWARE_SCHEDULER_PRIORITY_STRATEGY));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason why we need to change the spacing in many places in this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No.. It just accidentally happened..

@Ethanlm
Copy link
Contributor Author

Ethanlm commented Aug 28, 2017

I am done with testing and I think this patch is ready for review. Now I have ArtifactoryConfigLoader supports both artifactory+http and artifactory+https schemes.

@HeartSaVioR @revans2 @jerrypeng Could you please let me know if you have any suggestions on this patch when you get a chance. Thanks very much!


I want to explain a bit about Artifactory and ArtifactoryConfigLoader a bit if you are not familiar with the idea.

To use ArtifactoryConfigLoader, we can set the uri pointing to a directory or a file on Artifactory server.

We can query the REST API api/storage to get the metadata of the artifact. If the URI (set at scheduler.config.loader.uri ) points to a directory, we can get the metadata like this:
image

We have children which are the files under this directory and ArtifactoryConfigLoader will pick the file with the largest lexographic name.

It the URI points to a file, the metadata will be like this:
image

where we can get the "downloadUri" and then download that file.

Copy link
Contributor

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is looking good.

<dependency>
<groupId>com.google.auto.service</groupId>
<artifactId>auto-service</artifactId>
</dependency>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: reading through the documentation for this it indicates that we should mark this as <optional>true</optional>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressing it.


/**
* Configuration elements for scheduler config loader.
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I made a mistake. I thought initially that each of the configs in ConfigLoaderConfiguration were regular configs under storm.yaml. I should have read the code more closely, because they actually are sub-configs under "scheduler.config.loader.params".

I don't really like them being sub configs because they don't get validated. Which was the entire point of me asking you to add them into the Validated ServiceLoader. I also don't see a lot of value in having

scheduler.config.loader.params:
    scheduler.config.loader.uri: "artifactory+http://artifactory.my.company.com:8000/artifactory/configurations/clusters/my_cluster/ras_pools"
    scheduler.config.loader.timeout.sec: 30

over

scheduler.config.loader.uri: "artifactory+http://artifactory.my.company.com:8000/artifactory/configurations/clusters/my_cluster/ras_pools"
scheduler.config.loader.timeout.sec: 30

Could we change it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I also felt that scheduler.config.loader.params makes things unnecessarily complicated. Changing it

targetURI = new URI(uriString);
scheme = targetURI.getScheme().substring(ARTIFACTORY_SCHEME_PREFIX.length());
} catch(URISyntaxException e) {
LOG.error("Failed to parse uri={}", uriString);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to throw an exception here. I don't think we can or should recover from a bad URI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My movitation here is that if the ConfigLoader doesn't work correctly, we will just return null. Then scheduler will fall back to use "multitenant-scheduler.xml" file, or the configuration from storm.yaml. But I'm not sure if it's good

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally would prefer to see it fail fast here and not come up if the URI is not formatted correctly.

if (downloadURI != null) {
// Then get it and return the file as string.
String returnValue = doGet(null, location, host, port);
saveInArtifactoryCache(returnValue);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

saveInArtifactoryCache checks for null. I know it is not as readable here, but it does look like it works.

@Ethanlm
Copy link
Contributor Author

Ethanlm commented Aug 28, 2017

Tested it again. Should be good now.

Copy link
Contributor

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am +1 on this change. I would like to see us fail fast on the config issue, but even without it I am fine with the code as is.

@revans2
Copy link
Contributor

revans2 commented Aug 29, 2017

@HeartSaVioR you took a look at this patch before, do you want to take another look at it?

@Ethanlm could you squash your commits into just one?

@HeartSaVioR
Copy link
Contributor

@revans2 still +1.
Looks like we can merge this now.

@asfgit asfgit merged commit f1f9544 into apache:master Aug 30, 2017
@Ethanlm Ethanlm deleted the STORM-2201 branch September 28, 2017 13:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants