Python is an imperative language; we won’t try to be “purely functional” but “hidden state” could really screw us, so we’ll try to reduce that as much as possible.
We need:
- a gdbm-backed ActivityStreams object store
- pyld, for extensions
- verifiers
- method dispatch
The DMD store should keep each record in a dictionary like the one below, with the key being the “@id” of the top-level ActivityStreams object.
{"asobj": {"@type": "Object",
           "@id": "uuid:d773cb99-078b-496b-b3f0-012d3ade5930",
           "blah": "blah"},
 "private": {"blah": "blah"}}
We are going to have to handle inheritance manually, because there can be multiple types. We can’t use python’s inheritance system.
We need an “ASVocab” system to operate within. This one should have a memoized version of the json-ld expansion of the default activitystreams vocabulary, but it should also have a mapping of type URIs to ASClass objects.
The store will be used separately, should provide simple store and retrieve mechanisms.
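A minimal sketch of such a store, assuming the record layout described earlier ("asobj" plus "private", keyed by the top-level "@id"). It uses Python's generic `dbm` module rather than gdbm specifically, and the class and method names (`JsonDbmStore`, `store`, `retrieve`) are made up for illustration.

```python
import dbm
import json


class JsonDbmStore:
    """Tiny key/value store: JSON records keyed by the object's "@id"."""

    def __init__(self, path):
        # "c" opens the database read/write and creates it if it is missing.
        self._db = dbm.open(path, "c")

    def store(self, asobj, private=None):
        """Save one record under the "@id" of the top-level object."""
        record = {"asobj": asobj, "private": private or {}}
        key = asobj["@id"].encode("utf-8")
        self._db[key] = json.dumps(record).encode("utf-8")

    def retrieve(self, obj_id):
        """Return the stored record, or None if the id is unknown."""
        key = obj_id.encode("utf-8")
        if key not in self._db:
            return None
        return json.loads(self._db[key].decode("utf-8"))

    def close(self):
        self._db.close()
```

Keeping the store this dumb leaves application-specific hooks to whatever layer calls `store()` and `retrieve()`.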
Some complexity comes from the fact that in a “real world” system, we don’t just store and receive what’s been given to us. We need a way to trigger application-specific hooks.
Amy pointed out that it’s unnecessarily negative at one point
We should be able to consume this:
post_this = core.ASObj({
    "type": "Create",
    "id": "http://tsyesika.co.uk/act/foo-id-here/",
    "actor": {
        "type": "Person",
        "id": "http://tsyesika.co.uk/",
        "displayName": "Jessica Tallon"},
    "to": ["acct:cwebber@identi.ca",
           "acct:justaguy@rhiaro.co.uk",
           "acct:ladyaeva@hedgehog.example"],
    "object": {
        "type": "Note",
        "id": "http://tsyesika.co.uk/chat/sup-yo/",
        "content": "Up for some root beer floats?"}},
    vocab.BasicEnv)
This is less important. The AS2 core doc says that implementations need to treat “id” as an alias for “@id” and “type” as an alias for “@type”, but does not say these need to be the default representations.
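One possible way to handle the aliases is to normalize on the way in, so the rest of the code only ever sees “@id” and “@type”. This is just a sketch; the function name is hypothetical, and it does not decide what should happen if an object carries both “id” and “@id”.

```python
def normalize_keyword_aliases(jsobj):
    """Return a copy with "id"/"type" rewritten to "@id"/"@type", recursively."""
    aliases = {"id": "@id", "type": "@type"}
    if isinstance(jsobj, dict):
        # Note: if both "id" and "@id" are present, whichever comes last wins.
        return {aliases.get(key, key): normalize_keyword_aliases(value)
                for key, value in jsobj.items()}
    if isinstance(jsobj, list):
        return [normalize_keyword_aliases(item) for item in jsobj]
    return jsobj
```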
Should return the default vocabulary if none specified.
Probably what we ought to do is enforce that everything in an ASObj share a common @context setup.
This means it doesn’t matter what @context comes into ASObj(): we slice it off and replace it with the environment’s @context after the deepcopy_in (child objects don’t need an @context at all).
But we do need to be able to “ingest foreign material”, which means we should provide an Environment.ingest() method (or .ingest_foreign())
As for how to combine multiple contexts, it can be done via an array; see the local context part of the docs.
So that solves how to do things! (Though there may be some minor question as to what to do if the application’s @context is also an array… do we merge them? Will it work as-is?)
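As a sketch of one answer to the array question: flatten, so that an application context that is already an array gets spliced into the combined array instead of nested inside it. Whether flattening is actually the right behaviour is an assumption here; the helper name and the example URLs in the usage note are hypothetical.

```python
def combine_contexts(*contexts):
    """Combine contexts into one array, splicing in any that are already arrays."""
    combined = []
    for ctx in contexts:
        if ctx is None:
            # Treat a missing context as "nothing to add".
            continue
        if isinstance(ctx, list):
            combined.extend(ctx)
        else:
            combined.append(ctx)
    return combined
```

For example, `combine_contexts("http://www.w3.org/ns/activitystreams", [{"gmg": "http://mediagoblin.example/ns#"}])` gives a two-element array with the ActivityStreams context first.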
Basically, strip off the existing @context recursively with deepcopy_in, and add our context instead, if any exists.
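Roughly, that could look like the following. This is a guess at the behaviour, not the real deepcopy_in: it copies the structure, drops every “@context” it finds at any depth, and attaches the environment's context (if there is one) at the top level only.

```python
def strip_contexts(jsobj):
    """Copy a JSON-like structure, dropping "@context" at every level."""
    if isinstance(jsobj, dict):
        return {key: strip_contexts(value)
                for key, value in jsobj.items()
                if key != "@context"}
    if isinstance(jsobj, list):
        return [strip_contexts(item) for item in jsobj]
    return jsobj


def deepcopy_in(jsobj, env_context=None):
    """Copy jsobj without its own contexts, then graft on env_context if given."""
    copied = strip_contexts(jsobj)
    if env_context is not None:
        copied = {"@context": env_context, **copied}
    return copied
```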
This should expand out a json-ld document then compact it to the Environment’s own context.
The question is, can we use the “grafted-on @context” required in the above description, or do we have
Basically attach an @context to the json we have of it (though I guess this will happen automatically)
That didn’t make sense, because the extra context gets added to the asobj, so it doesn’t need to be implicit.
Currently we require setting up “object symbols”, which self-document what a method does, but they’re also slightly unwieldy to set up. It would be less precise, but maybe easier, to just use strings to represent what a method is.
Or rather, we should specify both a deepcopy_jsobj_in and a deepcopy_jsobj_out :)
So, if we’re accessing a key value pair where the value is a list of activitystreams objects, we’d like the activitystreams objects converted to ASObj objects as well.
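A small sketch of that conversion on access. The `ASObj` here is a stand-in, not the real class, and "is an activitystreams object" is approximated as "is a dict with an @type", which is an assumption.

```python
class ASObj:
    """Stand-in wrapper around a JSON-like dict (not the real ASObj)."""

    def __init__(self, jsobj, env=None):
        self._jsobj = jsobj
        self.env = env

    def __getitem__(self, key):
        return self._wrap(self._jsobj[key])

    def _wrap(self, value):
        # Assumption: a dict carrying "@type" is an activitystreams object.
        if isinstance(value, dict) and "@type" in value:
            return ASObj(value, self.env)
        if isinstance(value, list):
            return [self._wrap(item) for item in value]
        return value
```

Plain values such as the strings in a "to" list come back untouched; only nested objects get wrapped.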
RoyalCheckin or CheckUp
- checkup:CheckIn
- checkup:RoyalStatus
- checkup:Coupon
We can use the method dispatch system to handle this.
- Intro
- About ActiviPy
- Tutorial
- Core types
- Vocabulary
- Extending the environment
- Advanced Examples
How to do this?
We want to:
- probably preload a json-ld context
- Somehow make ASVocab objects useful for a
- make ourself more useful to ASObj objects
After all, I’m the one who started that project, and it’s abandoned…
Basically, the main reason is that we’d like to be able to do:
help(CollectionPage)
and get the appropriate useful info.
However, it’s still true that calling CollectionPage() should return an ASObj object, not a CollectionPage object. The reason is that ActivityStreams objects can have multiple “@type” values.
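One way to get both properties, sketched under assumptions: make each vocabulary entry a plain function with a generated docstring, so help() has something to show while the call still returns an ASObj. The `ASObj` stand-in, the `make_constructor` helper and the description text are all placeholders.

```python
class ASObj:
    """Stand-in for the real ASObj."""

    def __init__(self, jsobj):
        self.jsobj = jsobj


def make_constructor(short_name, uri, notes):
    """Build a documented function that constructs an ASObj of one type."""
    def constructor(id=None, **kwargs):
        jsobj = {"@type": short_name}
        if id is not None:
            jsobj["@id"] = id
        jsobj.update(kwargs)
        return ASObj(jsobj)

    # help() reads these attributes, so fill them in per type.
    constructor.__name__ = short_name
    constructor.__qualname__ = short_name
    constructor.__doc__ = "{}\n\nURI: {}".format(notes, uri)
    return constructor


CollectionPage = make_constructor(
    "CollectionPage",
    "http://www.w3.org/ns/activitystreams#CollectionPage",
    "A page of items from a Collection (placeholder description).")
```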
https://github.com/tobgu/pyrsistent
We more or less force/fake immutability right now, and maybe it makes more sense to just use something that is immutable
UPDATE: Canceled. More info on why Pyrsistent has a promising future, but can’t work for now.
This is its own project now. See this issue.
<evanpro> paroneayea: so, a couple of questions on that
<evanpro> Does having a single package that is a producer and a consumer make sense? Or multiple packages? [12:18]
<paroneayea> evanpro: my first goal is to make a library for the purpose of tests, basically along the lines of how you suggested… it’ll just store @id’s to a gdbm store. But I’ll design it in a way that afterwards, it can be used for something like pypump, and for using as2 stuff
<paroneayea> but my first goal is: fulfill the test requirements
<evanpro> Whoa!
<paroneayea> while working towards something more general
<paroneayea> gdbm is oldschool I know
<evanpro> Wait what’s the GDBM for?
<evanpro> I don’t understand what you need persistence for [12:19]
<paroneayea> well it could also just be a dictionary
<evanpro> Wouldn’t an AS2 library do something like
<paroneayea> I was going along with your suggestion that you have a command-line submission tool
<evanpro> JSON -> native language object
<evanpro> and native language object -> JSON
<paroneayea> evanpro: yes
<paroneayea> evanpro: ok well maybe it can be in-memory only [12:20]
<paroneayea> evanpro: my main concern is get the thing working
<evanpro> 1s
<evanpro> So I was thinking that a test command-line app might look like this
<evanpro> https://gist.github.com/evanp/b49c3fc37caa21a323a1
<strugee> hey, would it be useful if I created next week’s meeting page and filled it with the stuff on the agenda that we didn’t get to?
<strugee> e.g. we missed branching models
<evanpro> strugee: YES! [12:23]
<evanpro> Nice
<paroneayea> evanpro: that might work nicely
<strugee> will do
<paroneayea> evanpro: okay, I will probably do something like that [12:24]
<evanpro> paroneayea: and then a test driver would work like this
<evanpro> https://gist.github.com/evanp/5d80c0aa3f168465d84d
<evanpro> So that way you could call “testdriver.py dumpactivitytype.py” [12:25]
<evanpro> as well as “testdriver.py dumpactivitytype.rb”
<paroneayea> evanpro: ok
<paroneayea> evanpro: I see
<paroneayea> evanpro: we also want a way to show mutations [12:26]
<paroneayea> evanpro: and side effects
<paroneayea> eg update verbs should actually update the thing in store
<evanpro> That might be too much for a data format to deal with
<paroneayea> evanpro: I mean, for the test suite
<evanpro> Yes, that’s what I’m saying
<paroneayea> we want to be sure that activities can actually do the things they promise
<evanpro> What I’m saying is that no we don’t [12:27]
<evanpro> When we’re testing the social API, definitely
<paroneayea> evanpro: this is why I was saying that there’s not much to do as in terms of a test suite
<evanpro> But I think an activity streams library should just parse from JSON and export to JSON
<paroneayea> the only thing your example checks really is that it’s valid right?
<paroneayea> that it’s json, has the right fields, in the right types
<evanpro> It checks that the activitystreams implementation library (the one that the dumpactivitytype.py script imports) can find the type of an activity [12:28]
<evanpro> I realize that it appears to be really trivial
<evanpro> But you’d need dozens of such test scripts [12:29]
<evanpro> dumpactivityactortype.py
<evanpro> dumpactivityactorid.py
<evanpro> That kind of thing
<paroneayea> evanpro: okay, so I’ll definitely support this.
<evanpro> Another possibility is using command-line arguments
<paroneayea> evanpro: though, one of the things is, the activitystreams vocabulary does describe things with side effects
<paroneayea> I might test for that too, but I won’t make it so complex that you can’t do the simple tsts you ahve [12:30]
<evanpro> That’s probably a fair point
<evanpro> I would really, really strongly recommend that you first publish your intentions for the test format
<paroneayea> evanpro: to the list?
<evanpro> And that you concentrate on the bare minimum first
<evanpro> Yes
<paroneayea> evanpro: okay I’ll do that
<evanpro> to the list [12:31]
<paroneayea> evanpro: I was planning on working on deployment stuff this week, but it seems like this has become really urgent
<paroneayea> so I’ll make it priority #1
<evanpro> So, one thing we can do when we have even a rudimentary test suite
<evanpro> Is that we can start testing libraries
<evanpro> And so we can start writing libraries [12:32]
<paroneayea> evanpro: right
<evanpro> We could even have a hackathon to implement in a lot of different languages
<evanpro> And push implementations to npm, Ruby gems, pypi, etc.
<paroneayea> evanpro: anyway, maybe now you can see why I was looking at gdbm; if we do have a command line test thing and we do promise to deliver tests on side effects
<paroneayea> we need some way to persist things
<paroneayea> but
<paroneayea> I agree
<paroneayea> there are tests that don’t need that
<evanpro> Right, I hear you
<paroneayea> focus on the other stuff first.
<evanpro> They seem trivial but they are so important [12:33]
<evanpro> Probably the big thing is defining what the interface between testdriver script and the tested script is
<paroneayea> (and the reason why gdbm is even though it’s oldschool, it’s also dead easy to get working because it’s so “dumb”)
<paroneayea> evanpro: right.
<evanpro> Oh, yeah, GDBM is fine there
<evanpro> I might suggest using command-line args, too [12:34]
<paroneayea> evanpro: I get why you had a “don’t engineer this, chris!” reaction though :)
<evanpro> maybe something like this
<paroneayea> er
<paroneayea> overengineer
<evanpro> <dumpscript> --activity-part actor --part-property id <filename>
<evanpro> <dumpscript> --activity-part=actor --part-property=id <filename> [12:35]
<evanpro> Those are crummy names but 🤷
<evanpro> That way implementers don’t have to write 50 different testing shims
<paroneayea> evanpro: I hear you
<paroneayea> evanpro: well, it may even be easier [12:36]
<evanpro> It may also be worthwhile to have a producer test
<paroneayea> --extract ["actor"]["@id"]
<evanpro> That takes in some parameters and outputs some JSON
<evanpro> Sure
<evanpro> I’d be a little worried about defining a query language
<evanpro> But yeah
<paroneayea> evanpro: it’s probably equally complex to define a billion arguments
<evanpro> So a producer script might take arguments like this
<paroneayea> for the different components [12:37]
<evanpro> agreed!
<evanpro> <buildscript> --actor-id=urn:test:whatever --actor-name="Evan Prodromou" --activity-type="Like" --object-id=urn:test:whatever2 --object-name="This terrible test" [12:38]
<evanpro> But yeah pretty nightmarish
<paroneayea> evanpro: so is the idea that this should spit out a success/failure code or
<evanpro> Oh, no!
<evanpro> It should spit out JSON!
<paroneayea> just extract the right part?
<paroneayea> okay
<paroneayea> evanpro: and it should validate, right? [12:39]
<evanpro> dumpscript == take JSON, just spit out some extracted part of it
<evanpro> buildscript = take params, spit out JSON
<paroneayea> oh I see.
<paroneayea> okay that makes much more sense.
<paroneayea> echoscript == take json, dump out json
<paroneayea> sorry ;)
<evanpro> dumpscript and buildscript are provided by the implementer to test the implementation [12:40]
<evanpro> and there’s a test driver to run them
<evanpro> so “testdriver dumpscript.py buildscript.py”
<evanpro> Would run all the tests
<evanpro> Or something like that
<paroneayea> hm ok....
<paroneayea> evanpro: I don’t understand testdriver [12:41]
<paroneayea> what does it do?
<evanpro> Something like https://gist.github.com/evanp/5d80c0aa3f168465d84d
<evanpro> dumpscript == take JSON, just spit out some extracted part of it
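# Pseudocode: "activitystreams" stands in for the library under test.
import activitystreams
# parseCommandLineFileArgument() is a hypothetical helper that reads the file named on the command line
json_text = parseCommandLineFileArgument()
activity = activitystreams.Activity.fromJSON(json_text)
print(activity.type)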
<evanpro> <dumpscript> --activity-part=actor --part-property=id <filename>
<evanpro> buildscript = take params, spit out JSON
<evanpro> so “testdriver dumpscript.py buildscript.py”
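To make the dumpscript / buildscript split concrete, here is a rough sketch using the argument names floated in the chat. Everything else is an assumption: the functions take argv and return strings instead of reading files and printing, and the mapping of names to "displayName" and ids to "@id" is just one plausible choice.

```python
import argparse
import json


def build_activity(argv):
    """buildscript: take params, return an activity as a JSON string."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--actor-id")
    parser.add_argument("--actor-name")
    parser.add_argument("--activity-type")
    parser.add_argument("--object-id")
    parser.add_argument("--object-name")
    args = parser.parse_args(argv)
    return json.dumps({
        "@type": args.activity_type,
        "actor": {"@id": args.actor_id, "displayName": args.actor_name},
        "object": {"@id": args.object_id, "displayName": args.object_name},
    })


def dump_part(argv, json_text):
    """dumpscript: take JSON, return one extracted property."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--activity-part")  # e.g. actor or object; omit for top level
    parser.add_argument("--part-property", required=True)
    args = parser.parse_args(argv)
    activity = json.loads(json_text)
    part = activity[args.activity_part] if args.activity_part else activity
    return part[args.part_property]
```

A test driver could then feed the output of one into the other and compare the extracted value with what it passed in.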
Okay, so what do we want to do here?
- Vocabularies might provide an “implied context”. That’s the biggest issue, because otherwise it can be inferred unambiguously from expanding the document.
- Mostly, we might not want to re-read things?
This last one is a good goal but maybe we shouldn’t worry about it immediately.
Here’s the options from the JsonLdProcessor code:
class JsonLdProcessor(object):
    """
    A JSON-LD processor.
    """
    # [...]
    def expand(self, input_, options):
        """
        Performs JSON-LD expansion.
        :param input_: the JSON-LD input to expand.
        :param options: the options to use.
          [base] the base IRI to use.
          [expandContext] a context to expand with.
          [keepFreeFloatingNodes] True to keep free-floating nodes,
            False not to (default: False).
          [documentLoader(url)] the document loader
            (default: _default_document_loader).
        :return: the expanded JSON-LD output.
        """
- we probably want to be able to set expandContext.
- the documentLoader could thus possibly come with some context preloaded. But that’s kind of an optimization.
At least we know the two main steps now?
Update: It turns out the first of these is much simpler than we originally were thinking! There’s only one implied context in ActivityStreams, so we can hardcode the expandContext.
Should be passed into the environment, but possibly built out of the vocabulary.
The documentLoader seems to just be a function that accepts a URI, raises JsonLdError if something goes wrong, and otherwise returns a dictionary like this:
{
'contextUrl': None,
'documentUrl': url,
'document': data.decode('utf8')
}
So we could write a factory function that takes a mapping of {url: document}
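def make_simple_loader(url_map, load_unknown_urls=True, fallback_loader=None):
    """Build a documentLoader that serves documents from url_map ({url: document})."""
    def loader(url):
        if url in url_map:
            return {'contextUrl': None, 'documentUrl': url, 'document': url_map[url]}
        if load_unknown_urls and fallback_loader is not None:
            return fallback_loader(url)  # fallback_loader is an assumed extra parameter
        raise KeyError(url)  # real code would probably raise JsonLdError here
    return loader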
One way or another we want to reduce the amount of data duplicated from the building of the Environment
Note that normal python classes can’t work here.
This way we can catch any asobj types
We should do this like in the ANSI Common Lisp book: remove duplicates, but keep the last appearance of each “class”.
This is trickier than one may think; we can’t do Python style method resolution because an activity may have multiple types.
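A sketch of that linearization, assuming each type just knows its list of parents. The `ASType` here is a toy stand-in: walk every type depth-first, then drop duplicates while keeping the last appearance, so shared ancestors sink to the end of the chain.

```python
class ASType:
    """Toy type node with a name and a list of parent types."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)

    def __repr__(self):
        return "<ASType {}>".format(self.name)


def astype_inheritance_chain(*astypes):
    """Linearize one or more types, keeping the last appearance of each."""
    def traverse(astype):
        chain = [astype]
        for parent in astype.parents:
            chain.extend(traverse(parent))
        return chain

    full = []
    for astype in astypes:
        full.extend(traverse(astype))

    # Walk backwards so the copy we keep is the last appearance.
    seen = set()
    deduped = []
    for astype in reversed(full):
        if astype not in seen:
            seen.add(astype)
            deduped.append(astype)
    deduped.reverse()
    return deduped
```

With Create and Note both descending from Object, an object typed as both gets the chain Create, Activity, Note, Object.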
Something like:
from activipy import vocab
root_beer_note = vocab.Create(
actor=vocab.Person(
"http://tsyesika.co.uk",
displayName="Jessica Tallon"),
to=["acct:cwebber@identi.ca"],
object=vocab.Note(
"http://tsyesika.co.uk/chat/sup-yo/",
content="Up for some root beer floats?"))
This should be able to flow pretty naturally out of our types.py interface.
So here’s how this thing works.
There’s an environment, which has a mapping between tuples of (method_symbol, Vocab) and method_to_call.
# method name, description, invocation method
save = Method("save", "Save things", handle_one)
gather_something = Method("gather_something", "Accrues some info", handle_map)
myenv = Environment(
    mapping={
        (save, Note): note_save,
        (save, Object): basic_save,
    })
handle_one(myobj, save, db)
This way, using the inheritance_chain() method, we can handle various types of method handling:
- handle_one
- handle_map
- handle_fold
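Here is one reading of the three handlers, as a sketch only. It uses strings for methods and type names and a hard-coded parent table to stay self-contained, and it passes the mapping explicitly instead of going through an environment. The exact semantics (first match only, call everything, thread an accumulator) are an interpretation of the names.

```python
# Hypothetical parent table standing in for real vocabulary classes.
PARENTS = {
    "Object": [],
    "Activity": ["Object"],
    "Create": ["Activity"],
    "Note": ["Object"],
}


def inheritance_chain(types):
    """Depth-first walk over all types, keeping the last appearance of each."""
    full = []

    def walk(name):
        full.append(name)
        for parent in PARENTS[name]:
            walk(parent)

    for name in types:
        walk(name)
    chain = []
    for name in reversed(full):
        if name not in chain:
            chain.append(name)
    return chain[::-1]


def find_implementations(mapping, method, asobj):
    """Collect the functions registered for this method, most specific first."""
    types = asobj["@type"]
    if isinstance(types, str):
        types = [types]
    return [mapping[(method, name)]
            for name in inheritance_chain(types)
            if (method, name) in mapping]


def handle_one(mapping, method, asobj, *args):
    """Call only the most specific implementation."""
    impls = find_implementations(mapping, method, asobj)
    if not impls:
        raise LookupError("no implementation of {!r}".format(method))
    return impls[0](asobj, *args)


def handle_map(mapping, method, asobj, *args):
    """Call every implementation along the chain and collect the results."""
    return [impl(asobj, *args)
            for impl in find_implementations(mapping, method, asobj)]


def handle_fold(mapping, method, asobj, initial, *args):
    """Thread an accumulator through every implementation along the chain."""
    result = initial
    for impl in find_implementations(mapping, method, asobj):
        result = impl(result, asobj, *args)
    return result
```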
However, we have enough metadata here to provide some sugar.
myenv = Environment(
    mapping={bla bla},
    vocab=vocab)
activity = myenv.c.Activity("http://oh/snap")
activity.m.save(db)
# or maybe even just activity.save()
This would have to mean that ASObj gets a method dispatch keyword option on construction, which might be a-ok.
I think this is a pretty good approach.
save_object = Method("save things", "handle_one")
myenv = Environment(
    mapping={
        (save_object, Note): note_save,
    })
handle_one(myobj, "save_object", db)
handle_one(myobj, save_object, db)
# more pythonic optional interface
# a bit leaky though
myenv = MetaEnvironment(
    mapping={
        (save_object, Note): note_save,
    },
    vocab=[BasicVocab]
)
myenv.Person("foo")
Person()
Should methods be able to themselves take advantage of method dispatch? If so, they will need “env” as first argument.
Here’s the problem.
Assume we made an activity like this:
ROOT_BEER_NOTE_VOCAB = vocab.Create(
    "http://tsyesika.co.uk/act/foo-id-here/",
    actor=vocab.Person(
        "http://tsyesika.co.uk/",
        displayName="Jessica Tallon"),
    to=["acct:cwebber@identi.ca",
        "acct:justaguy@rhiaro.co.uk"],
    object=vocab.Note(
        "http://tsyesika.co.uk/chat/sup-yo/",
        content="Up for some root beer floats?"))
Now assume we made one like this:
ROOT_BEER_NOTE_JSOBJ = types.ASObj({
    "@type": "Create",
    "@id": "http://tsyesika.co.uk/act/foo-id-here/",
    "actor": {
        "@type": "Person",
        "@id": "http://tsyesika.co.uk/",
        "displayName": "Jessica Tallon"},
    "to": ["acct:cwebber@identi.ca",
           "acct:justaguy@rhiaro.co.uk"],
    "object": {
        "@type": "Note",
        "@id": "http://tsyesika.co.uk/chat/sup-yo/",
        "content": "Up for some root beer floats?"}})
Now even worse:
ROOT_BEER_NOTE_JSOBJ = types.ASObj({
    # AAAAAAAAAAA
    "@type": "http://www.w3.org/ns/activitystreams#Create",
    "@id": "http://tsyesika.co.uk/act/foo-id-here/",
    "actor": {
        "@type": "Person",
        "@id": "http://tsyesika.co.uk/",
        "displayName": "Jessica Tallon"},
    "to": ["acct:cwebber@identi.ca",
           "acct:justaguy@rhiaro.co.uk"],
    "object": {
        "@type": "Note",
        "@id": "http://tsyesika.co.uk/chat/sup-yo/",
        "content": "Up for some root beer floats?"}})
So…
- we really need to know about the whole set of vocabularies in order to do ASObj.type_astype()
- Obviously, we also need it for method dispatch
- It could be then that we don’t load ASObj.vocab, but ASObj.env
- Also, in general you can always do env.asobj_astypes(asobj)
- Thus, we should also provide env.asobj_method(asobj, method_symbol)
- Which also means, more obviously, and as a prerequisite, we must provide Environment.asobj_astype_chain(asobj)!
This also means that users should, in general, not use ASObj.type_astype(), unless they’re using the “sugar” edition which comes from supplying an environment.
We might want to also provide an expanded=True argument to some of those methods.
OR, maybe we can do “cheapest available” determination of an ASType.
What are the ways we might go about pulling down an ASType?
- By short ID… but this requires this short ID be marked “safe” for short expansion
- By already known URI
- By json-ld examination (most expensive!)
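The “cheapest available” idea could be sketched like this: try the lookups in order of cost and only fall through to JSON-LD expansion when nothing cheaper matches. The function name and arguments are hypothetical, and `expand` is a caller-supplied callable standing in for whatever actually performs the expansion.

```python
def resolve_type_uri(type_value, shortids, known_uris, expand=None):
    """Resolve an "@type" value to a full URI, trying the cheap lookups first."""
    # 1. Short id, but only if it was marked safe for short expansion.
    if type_value in shortids:
        return shortids[type_value]
    # 2. A URI we already know about.
    if type_value in known_uris:
        return type_value
    # 3. JSON-LD examination (most expensive), if the caller provided a way.
    if expand is not None:
        return expand(type_value)
    raise LookupError("cannot resolve type {!r}".format(type_value))
```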
Do we really want an expand=None? Maybe that’s kind of dumb
The question is, where do we mark whether it’s safe to treat the short_id as a representation of the type? Is it in the environment or in the vocab?
The vocab may make sense because we could do a shortids=load_from_vocabs((Vocab1, None), (GMGVocab, "gmg:"))
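A sketch of what load_from_vocabs might do, assuming a vocabulary can be reduced to a {short name: URI} dictionary (the real vocab objects surely carry more). A prefix of None means the bare short name is considered safe; otherwise the prefix is prepended.

```python
def load_from_vocabs(*vocab_prefix_pairs):
    """Build {short id: type URI} from (vocab, prefix) pairs."""
    shortids = {}
    for vocab, prefix in vocab_prefix_pairs:
        for short_name, uri in vocab.items():
            key = short_name if prefix is None else prefix + short_name
            # Refuse to silently let two vocabularies claim the same short id.
            if key in shortids and shortids[key] != uri:
                raise ValueError("ambiguous short id: {}".format(key))
            shortids[key] = uri
    return shortids
```

Raising on a clash is one option; letting later vocabularies override earlier ones would be another.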
JF2 is the new Microformats JSON representation, but there’s a new version that has a json-ld context. Add it as a vocabulary!
https://github.com/w3c-social/Social-Syntax-Brainstorming/wiki/jf2