simple example of conversion functions #2

jbenet · 2014-04-17T22:03:50Z

not final, food for thought.

Want output

{
  'name': 'Juan Batiz-Benet',
  'city': 'San Francisco, CA'
}

Input Type FOO

{
  'name': {
    '@type': 'pandat/name',
    'label': 'NAME',
    'codec': 'pandat/name-last-name-first'
  },
  'addr': {
    '@type': 'pandat/us-street-address',
    'label': 'ADDR',
  }
}

Output Type BAR

{
  'name': 'pandat/name',
  'city': 'pandat/us-city',
}

You can write a conversion function, use it and/or publish it to pandat:

(excuse this interface, might be simplified some)

var Foo2Bar = pandat.Conversion({'invertible': false}, [Foo], [Bar]);

Foo2Bar.convert = function(foo) {
  return {
    'name': pandat(Foo.name['@type'], Bar.name['@type'], foo.name),
    'city': pandat(Foo.addr['@type'], Bar.city['@type'], foo.addr)
  }
}
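
To make the shape of such a function concrete, here is a minimal runnable sketch. It does not use pandat (nothing there works yet); `conversions`, `convertValue`, and the sample address are hypothetical stand-ins, and the codec name is used as the source type key purely as a simplification.

```javascript
// Hypothetical stand-in for pandat's type-to-type lookup; not the real API.
var conversions = {
  // 'Last, First' -> 'First Last'
  'pandat/name-last-name-first->pandat/name': function (value) {
    var parts = value.split(',');
    return parts[1].trim() + ' ' + parts[0].trim();
  },
  // 'street, city, ST zip' -> 'city, ST'
  'pandat/us-street-address->pandat/us-city': function (value) {
    var parts = value.split(',').map(function (s) { return s.trim(); });
    var state = parts[2].split(' ')[0];
    return parts[1] + ', ' + state;
  }
};

// Look up and apply a single value conversion between two named types.
function convertValue(fromType, toType, value) {
  var fn = conversions[fromType + '->' + toType];
  if (!fn) throw new Error('no conversion from ' + fromType + ' to ' + toType);
  return fn(value);
}

// Hand-written Foo -> Bar conversion, mirroring Foo2Bar.convert above.
function foo2bar(foo) {
  return {
    name: convertValue('pandat/name-last-name-first', 'pandat/name', foo.name),
    city: convertValue('pandat/us-street-address', 'pandat/us-city', foo.addr)
  };
}

// Sample input is made up for illustration.
var result = foo2bar({
  name: 'Batiz-Benet, Juan',
  addr: '123 Main St, San Francisco, CA 94105'
});
console.log(result);
```

With that made-up input, `result` has the shape of the wanted output at the top of this issue.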

Or, pandat might be able to generate the function, with some hints about how the names map to each other. (not quite sure what the right interface is here, but will think about it.)

yoshuawuyts · 2014-04-17T23:50:47Z

I'd like to see something more along the lines of this:

Source:

{ 
  "name": {
    "type": "pandat/name-last-name-first",
    "label": "NAME"
  },
  "city": {
    "type": "pandat/us-street-address",
    "label": "ADDR"
  }
}

Output:

{ 
  "name": {
    "type": "pandat/name",
    "label": "NAME",
    "source": "NAME"
  },
  "addr": {
    "type": "pandat/us-city",
    "label": "CITY",
    "source": "ADDR"
  }
}

Converter:

/**
 * Module dependencies
 */

var object1 = require('./object1.json');
var fooSchema = require('./foo.json');
var barSchema = require('./bar.json');
var pandat = require('pandat');

/**
 * Initialize converter.
 *
 * @param {Object} sourceSchema
 * @param {Object} targetSchema
 * @return {Function}
 */

var converter = pandat.Conversion(fooSchema, barSchema, {invertible: false});

/**
 * Execute conversion
 */

var resultObject = converter(object1);

I think that if you design your relations beforehand, there'll be no need for further declarations. Such an implementation would allow for more flexibility, and result in a cleaner API.
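
One way to read this proposal is that the converter can be derived from the two schemas alone, by matching each output field's `source` to an input field's `label`. A rough sketch under that assumption follows; `makeConverter` and `describe` are hypothetical names, and the per-value conversion is a placeholder rather than anything pandat provides.

```javascript
// Build a record converter from two schemas, matching target.source to source.label.
function makeConverter(sourceSchema, targetSchema, convertValue) {
  // Index source fields by label so target fields can refer to them.
  var byLabel = {};
  Object.keys(sourceSchema).forEach(function (key) {
    byLabel[sourceSchema[key].label] = { key: key, type: sourceSchema[key].type };
  });

  return function (record) {
    var out = {};
    Object.keys(targetSchema).forEach(function (key) {
      var field = targetSchema[key];
      var src = byLabel[field.source];
      if (!src) throw new Error('no source field labelled ' + field.source);
      out[key] = convertValue(src.type, field.type, record[src.key]);
    });
    return out;
  };
}

// Schemas copied from the Source and Output examples above.
var source = {
  name: { type: 'pandat/name-last-name-first', label: 'NAME' },
  city: { type: 'pandat/us-street-address', label: 'ADDR' }
};
var target = {
  name: { type: 'pandat/name', label: 'NAME', source: 'NAME' },
  addr: { type: 'pandat/us-city', label: 'CITY', source: 'ADDR' }
};

// Placeholder value conversion: it only reports which type pair was requested.
function describe(fromType, toType, value) {
  return fromType + ' -> ' + toType + ': ' + value;
}

var convert = makeConverter(source, target, describe);
var described = convert({ name: 'Batiz-Benet, Juan', city: '123 Main St' });
console.log(described);
```

The real work still lives in the per-type conversions; the schemas only decide which field feeds which.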

Also: I didn't quite catch the difference between a codec and conversion, could you explain what you mean by that?

jbenet · 2014-04-18T09:18:45Z

Hello!

> I think that if you design your relations beforehand, there'll be no need for further declarations. Such an implementation would allow for more flexibility, and result in a cleaner API.

Yeah! That's what I meant above by "pandat might be able to generate the function, with some hints about how the names map to each other".

> "source": "NAME"

I like this relational mapping, though it can't quite live on the output type, as the output type may be an input type elsewhere. Relevant to mention here is that users will be reusing types published by others. It should be totally possible to just specify:

var converter = pandat.Conversion('jbenet/foo', 'jbenet/bar')

Given I published foo and bar schemas :)

> I didn't quite catch the difference between a codec and conversion, could you explain what you mean by that?

Yeah. A Codec is a named pair of functions to encode and decode between raw data and typed objects. For example, see https://github.com/jbenet/pandat/blob/master/stdlib/json_codec.js and https://github.com/jbenet/pandat/blob/master/stdlib/xml_codec.js (these are just examples, nothing works yet). Codecs don't have to be as general as json or xml. They can be type-specific. See https://github.com/jbenet/pandat/blob/master/stdlib/date_type.js#L23-L35 (again nothing works yet, there's errors there :] ). Codecs can be published and installed (npm modules).
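
As a rough illustration of "a named pair of functions", here is a sketch of one general codec and one type-specific codec. The object shape (`name`, `encode`, `decode`), the direction (encode goes from typed object to raw data), and the `{ first, last }` name object are assumptions for this example, not the layout used in the linked files.

```javascript
// General codec: typed objects <-> raw JSON text.
var jsonCodec = {
  name: 'json',
  encode: function (obj) { return JSON.stringify(obj); },
  decode: function (raw) { return JSON.parse(raw); }
};

// Type-specific codec: { first, last } <-> 'Last, First'.
var lastNameFirstCodec = {
  name: 'name-last-name-first',
  encode: function (n) { return n.last + ', ' + n.first; },
  decode: function (raw) {
    var i = raw.indexOf(',');
    if (i === -1) throw new Error('expected "Last, First", got: ' + raw);
    return { last: raw.slice(0, i).trim(), first: raw.slice(i + 1).trim() };
  }
};

var typedName = lastNameFirstCodec.decode('Batiz-Benet, Juan');
console.log(typedName, lastNameFirstCodec.encode(typedName));
```

A codec only moves between raw data and one typed object; going from one type to another is the job of a Conversion.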

A Conversion is a function converting between two types. The example above shows converting between Foo and Bar. While it's certainly possible to generate conversion functions from relations (inferred based on the types, or specified with source/target keys), many conversion functions will be complex and require programming. These would be publishable/installable modules as well.

Lmk if that makes sense? Will put this all on the Readme.

yoshuawuyts · 2014-04-18T15:31:27Z

Your explanation of Codec makes sense. But before I start suggesting any changes, let me check if I understood it correctly:

A conversion has a:

input schema
output schema
link schema, which plots the transformation from A to B

An input schema has:

Types, which define the data type
Labels, which act as unique IDs
Codecs, which prepare the data for conversion

An output schema has:

Types, which define the data type
Codecs, which prepare the data for consumption

Or outputSchema == inputSchema? Let me know if this sounds about right.

jbenet · 2014-04-18T17:11:30Z

Output schema == input schema. They're the same thing. They define Types. Types can be used as inputs or outputs in a conversion.

Other than that, right on!

yoshuawuyts · 2014-04-19T00:04:45Z

I really dislike the @something syntax. I don't think keys should be namespaced if they're not used outside pandat/transform.

And couldn't this:

/**
 * Module dependencies
 */

var outputSchema = require('./bar');
var inputSchema = require('./foo');
var linkSchema = require('baz');
var pandat = require('pandat');

/**
 * Initialize converter.
 */

var Foo2Bar = pandat.Conversion({'invertible': false}, [inputSchema], [outputSchema]);

Foo2Bar.convert = function(linkSchema) {
  return {
    'name': pandat(inputSchema.name['@type'], outputSchema.name['@type'], linkSchema.name),
    'city': pandat(inputSchema.addr['@type'], outputSchema.city['@type'], linkSchema.addr)
  }
}

be rewritten to this:

/**
 * Module dependencies
 */

var outputSchema = require('./bar');
var inputSchema = require('./foo');
var linkSchema = require('baz');
var pandat = require('pandat');

/**
 * Export converter.
 */

var converter = module.exports = pandat({'invertible': false});

converter.schema = {
  'name': [inputSchema.name, outputSchema.name, linkSchema.name],
  'city': [inputSchema.addr, outputSchema.city, linkSchema.city],
}

You could use an internal function to execute converter.schema. Not sure if closures are passed around correctly though.

The less friction the API causes, the more developers will love using it. Imo things like @type should be avoided. What do you think?

jbenet · 2014-04-19T03:54:02Z

> I really dislike the @something syntax.

Take that up with json-ld.org :)

> I don't think keys should be namespaced if they're not used outside pandat/transform.

They are, the goal is for all transformer objects to have a definition in JSON-LD. (sorry, haven't made it clear in the README.) They'll have their own @context, etc. The trick is that the library can fill in a lot of the standard stuff, so:

(s/pandat/transformer/ in your mind here)

t = pandat.Type({
  'name': {
    '@type': 'pandat/name',
    'label': 'NAME',
    'codec': 'pandat/name-last-name-first'
  },
  'addr': {
    '@type': 'pandat/us-street-address',
    'label': 'ADDR',
  }
})

fills in:

> t.src
{
  '@context': 'http://pandat.io/context/pandat.jsonld',
  '@type': 'Type',
  'codec': 'pandat/identity-codec',
  'schema': {
    'name': {
      '@type': 'pandat/name',
      'label': 'NAME',
      'codec': 'pandat/name-last-name-first'
    },
    'addr': {
      '@type': 'pandat/us-street-address',
      'label': 'ADDR',
    }
  }
}

See https://github.com/jbenet/pandat/blob/master/js/type.js

Though none of this is final. Will try to have working code by end of this weekend.

max-mapper · 2014-04-19T03:57:13Z

just for the sake of argument, how about this for a minimum viable JSON type:

t = pandat.Type({
  'name': {
    'type': 'name',
    'label': 'NAME',
    'codec': 'name-last-name-first'
  },
  'addr': {
    'type': 'us-street-address',
    'label': 'ADDR',
  }
})

i.e. treat type as @type if @type doesn't exist (agreed that @ symbols in keys are weird), and default all types to the pandat/ namespace if no other 'namespace' is specified
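
A sketch of what that defaulting could look like, assuming a name counts as namespaced when it contains a `/` and that codec names get the same default namespace as types. `qualify`, `normalizeField`, and `normalizeSchema` are hypothetical helpers, not part of pandat.

```javascript
var DEFAULT_NAMESPACE = 'pandat/';

// Prefix the default namespace unless the name already has one.
function qualify(name) {
  return name.indexOf('/') === -1 ? DEFAULT_NAMESPACE + name : name;
}

// Rewrite one field: 'type' becomes '@type' (an explicit '@type' wins), names are qualified.
function normalizeField(field) {
  var out = {};
  Object.keys(field).forEach(function (key) {
    if (key !== 'type' && key !== '@type') out[key] = field[key];
  });
  var type = field['@type'] !== undefined ? field['@type'] : field.type;
  if (type !== undefined) out['@type'] = qualify(type);
  if (out.codec !== undefined) out.codec = qualify(out.codec);
  return out;
}

// Apply normalizeField to every field of a schema.
function normalizeSchema(schema) {
  var out = {};
  Object.keys(schema).forEach(function (key) {
    out[key] = normalizeField(schema[key]);
  });
  return out;
}

var normalized = normalizeSchema({
  name: { type: 'name', label: 'NAME', codec: 'name-last-name-first' },
  addr: { type: 'us-street-address', label: 'ADDR' }
});
console.log(normalized);
```

Run on the minimal type above, this yields the same fully qualified fields as the longer `@type` version earlier in the thread.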

jbenet · 2014-04-19T03:59:02Z

As for the example, the goal is that most users won't have to write their own conversion functions at all, and will simply use published ones. Some people will, and in those cases they can either do it in code directly or supply a relational schema (expressing the mapping of one type to the other) that allows transformer to generate the code. Precisely like you suggest! :)

> You could use an internal function to execute converter.schema

👍

jbenet · 2014-04-19T04:03:47Z

> just for the sake of argument, how about this for a minimum viable JSON type:

Yeah! lgtm! both filling in the @ and default namespace. If we run into problems, figure it out then.

yoshuawuyts · 2014-04-19T11:52:01Z

👍

jbenet · 2014-04-26T02:55:28Z

Turns out the @context can symlink type -> @type 👍

> @id is not required for a valid JSON-LD document. Also note that you can alias "@id" to something less strange looking, like "id" or "url", for instance.

From frictionlessdata/datapackage#110 (comment)
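
For reference, JSON-LD keyword aliasing is declared in the `@context` by mapping a plain term to a keyword. The sketch below shows such a context plus a toy `expandAliases` helper (hypothetical, and only a key renamer, not a JSON-LD processor) to illustrate what the alias means for documents written with `type` and `id`.

```javascript
// A document that aliases the JSON-LD keywords @type and @id to plain keys.
var doc = {
  '@context': { type: '@type', id: '@id' },
  name: { type: 'pandat/name', label: 'NAME' }
};

// Toy helper: rename aliased keys back to their keywords, recursively.
function expandAliases(node, aliases) {
  if (Array.isArray(node)) {
    return node.map(function (item) { return expandAliases(item, aliases); });
  }
  if (node === null || typeof node !== 'object') return node;
  var out = {};
  Object.keys(node).forEach(function (key) {
    if (key === '@context') { out[key] = node[key]; return; }
    var target = Object.prototype.hasOwnProperty.call(aliases, key) ? aliases[key] : key;
    out[target] = expandAliases(node[key], aliases);
  });
  return out;
}

var expanded = expandAliases(doc, doc['@context']);
console.log(expanded.name['@type']);
```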
