-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/wayback machine #286
base: master
Are you sure you want to change the base?
Feature/wayback machine #286
Conversation
@@ -0,0 +1,134 @@ | |||
const request = require('request'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is not critical, but I'd recommend to reorganize imports:
- put apart third part imports and local imports
- put improts in the alphabet order by name of lib:
const mustache = require('mustache');
const traverse = require('traverse');
const xml2js = require('xml2js');
You could check such organization in any of my js-files.
Such organization would help to read imports.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, we don't have request
libs here, we already use axios
(https://github.com/axios/axios). The reason why we use axios (in the single place - interceptors):
- use single
user-agent
for all request - track request performance
Actually we could use another lib, but should have real motivation to switch to it.
function handler (app) { | ||
// Check parameter for Wayback qualifier | ||
let wayback = app.params.getByName('wayback'); | ||
if (wayback.includes('wayback') === false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it doesn't look javascript style way. I'd recommend to reformat it to:
if (!wayback.includes('wayback')) {
} | ||
|
||
// Get url parameter | ||
var url = app.params.getByName('url'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend to replace var
with const
and let
. It would be safer.
var url = app.params.getByName('url'); | ||
|
||
// Create wayback object | ||
var waybackObject = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't set waybackObject
just in place? I'd recommend to rewrite it in:
waybackObject = {
ulr,
earliestYear: 0,
so you would need later assignment
waybackObject.url = url;
}; | ||
|
||
waybackObject.url = url; | ||
var archiveQueryURL = 'http://web.archive.org/__wb/search/metadata?q=' + url; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we store all string constants in src/functions/config.js
so it would be much simpler to rewrite all endpoints in once if we decide to use a proxy, new API or something else.
@@ -0,0 +1,134 @@ | |||
const request = require('request'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, we don't have request
libs here, we already use axios
(https://github.com/axios/axios). The reason why we use axios (in the single place - interceptors):
- use single
user-agent
for all request - track request performance
Actually we could use another lib, but should have real motivation to switch to it.
var XMLparser = new xml2js.Parser(); | ||
XMLparser.parseString(allData[1], function (err, result) { | ||
if (err) { | ||
debug('Uh-oh, the XML parser didn\'t work'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should gracefully continue dialog here, for example, propose to retry or ask something else.
@@ -0,0 +1,134 @@ | |||
const request = require('request'); | |||
const traverse = require('traverse'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should lodash
(https://lodash.com/) instead, it is much more powerful and we already have it in this app
@@ -0,0 +1,134 @@ | |||
const request = require('request'); | |||
const traverse = require('traverse'); | |||
const xml2js = require('xml2js'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all new imports should be installed by:
npm install --save xml2js
when you would need new lib for testing you should use instead:
npm install --save-dev xml2js
// Construct response dialog for action | ||
if (waybackObject.alexaUSRank !== 0) { | ||
waybackObject.speech = mustache.render(waybackStrings.speech, waybackObject); | ||
waybackObject.speech += ' and <say-as interpret-as="ordinal">' + waybackObject.alexaUSRank + '</say-as> in the United States'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend to use template literals instead https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Template_literals
}); | ||
} | ||
|
||
function archiveEngine (archiveJSON, waybackObject) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just general comments. I'd recommend writing pure functions everywhere it possible.
Motivation:
- they are much easy to test
- easy to reuse
if (err) { | ||
debug('The XML parser didn\'t work'); | ||
waybackObject.speech = waybackStrings.error; | ||
dialog.ask(app, waybackObject); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we would like to change dialog flow here we should embed XMLparser.parseString
in Promises pipeline. For example by wrapping it to:
new Promise((resolve, reject) => {
XMLparser.parseString(allData[1].data, function (err, result) {
//...
})
Because right now we will reach dialog.close(app, waybackObject);
whether result you would get in XMLparser.parseString
.
Actually, I'd recommend to create unit tests for that part of the code - it would help a lot. All async scenarios are very tricky.
const _ = require('lodash'); | ||
const axios = require('axios'); | ||
const mustache = require('mustache'); | ||
const xml2js = require('xml2js'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should install xml2js
with:
npm install --save xml2js
to add it to package.json
waybackObject.latestYear = yearsArray[yearsArray.length - 1]; | ||
|
||
// Traverse URL category | ||
const traverse = obj => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls put this function to ./utils
directory. Btw it is a good candidate for unit tests :)
functions/package.json
Outdated
@@ -37,7 +37,8 @@ | |||
"object.entries": "^1.0.4", | |||
"raven": "^2.6.0", | |||
"replaceall": "^0.1.6", | |||
"supports-color": "^5.4.0" | |||
"supports-color": "^5.4.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, how do you use this lib here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I'm not sure how that ended up here! I believe that's an atom module I use for formatting in the editor. That can come out.
waybackObject.speech = waybackStrings.error; | ||
dialog.ask(app, waybackObject); | ||
} else { | ||
alexaJSON = JSON.parse(JSON.stringify(result)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems that you forgot to run resolve
here, and btw, you don't need alexaJSON
here, because you can just pass a result to resolve
.
if (err) { | ||
debug('The XML parser didn\'t work'); | ||
waybackObject.speech = waybackStrings.error; | ||
dialog.ask(app, waybackObject); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it a good place for reject
?
.catch(function (error) { | ||
debug(error.message); | ||
}); | ||
alexaEngine(alexaJSON, waybackObject); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems alexaEngine
should be inside of Promise then pipeline
functions/src/utils/traverse.js
Outdated
@@ -0,0 +1,31 @@ | |||
const _ = require('lodash'); | |||
const {debug} = require('../utils/logger')('ia:actions:wayback-machine'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'ia:actions:wayback-machine'
should be 'ia:actions:utils:traverse'
dialog = mockDialog(); | ||
action.__set__('dialog', dialog); | ||
}); | ||
it('check to see that a promise is returned with network requests', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please give some space here ;)
resolve(result); | ||
} | ||
}); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need to store convertXML
promise. You are already inside of Promise.then
pipeline so you can return new Promise. And btw you need to pass archiveJSON
down the promises chain. So you could use something like:
return Promise.all([archiveJSON, new Promise((resolve, reject) => {
//...
}])
and btw it would need it because part of code from // Construct response dialog for action
will be called before: alexaEngine(JSON.parse(JSON.stringify(fulfilled)), waybackObject);
Btw I'd recommend to write unit test of this pipeline, so you would see how promise will fire one after another.
In additional I'd recommend to put:
XMLparser.parseString(allData[1].data, function (err, result) {
in separate function so it would improve readability of this pipeline.
expect(Promise.resolve()).to.be.a('promise'); | ||
}); | ||
|
||
it('check to see that axios request promises are working', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we shouldn't test axios library.
describe('wayback machine', () => { | ||
let app; | ||
let dialog; | ||
let archiveRequest = axios.get('http://web.archive.org/__wb/search/metadata?q=cnn.com'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't request data for a test from an ourside. If you would need any data you should create fixture data.
let obj = { earliestYear: 0, latestYear: 0, totalUniqueURLs: 0 }; | ||
let result; | ||
|
||
action.__set__('archiveEngine', function (a, b) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems this test has never come inside of this mock. You never call handler
in this test on others. Actually, it is not clear what: check to see that archiveEngine is working
means here, could you be more specific?
config.wayback.ALEXA, app, waybackObject | ||
); | ||
|
||
return Promise.all([axios.get(archiveQueryURL), axios.get(alexaQueryURL)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't use callback nesting here. One of the main feature of Promises is avoiding callback hell. It should be something like:
action.then(a => {
return Promise.all(/*....*/);
})
.then(b => {
return Promise.all(/*....*/);
})
.then(c => {
});
// debug('Inside archiveEngine promise...'); | ||
// Create array of capture years and then find earliest year | ||
// and most recent year. | ||
let yearsArray = Object.keys(archiveJSON.captures); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, it is better to sort years, to be sure that they are in the order
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
much better but still need to make few improvements
return Promise.all([archiveEngine(requestData[0].data, waybackObject), xmlConverter(requestData[1].data)]); | ||
}) | ||
.then(response => { | ||
return Promise.all([alexaEngine(response[1], waybackObject)]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need to wrap alexaEngine(
call to Promise.all
here. Because you have single async function here. and btw you don't really need alexaEngine
to be async (with Promise) because it has sync logic.
debug('Country not found'); | ||
debug(e); | ||
} | ||
if (waybackObject.alexaWorldRank > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need to check that alexaWorldRank > 0
here, and what should be in case when alexaWorldRank <= 0
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend that function alexaEngine
would be in the pure-function way: https://en.wikipedia.org/wiki/Pure_function
You don't need to mutate waybackObject
here, just return rands:
function getAlexaRanks (alexaJSON) {
//.... some in logic to extract ranks from alexa json
return {
alexaWorldRank: 123456,
alexsUSRank: 1345,
};
}
and then you can merge the result inside of handler:
res = {}
// it merges res to empty object and then merge getAlexaRanks
res = Object.assign({}, res, getAlexaRanks (alexaJSON));
}); | ||
} // End of handler | ||
|
||
function archiveEngine (archiveJSON, waybackObject) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function also should be pure-function and it doesn't need to be async
.then(response => { | ||
return Promise.all([alexaEngine(response[1], waybackObject)]); | ||
}) | ||
.then(a => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need argument in arrow function if you don't use it:
.then(() => {
//... logic
})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pure functions! We have significant progress here 👍 But still need to make some improvements
}); | ||
} // End of handler | ||
|
||
function archiveEngine (archiveJSON) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the inner logic of archiveEngine
and alexaEngine
are not asynchronous, so you don't need Promises here and there.
function archiveEngine (archiveJSON) { | ||
return new Promise(function (resolve, reject) { | ||
debug('Inside archiveEngine promise...'); | ||
let obj = { earliestYear: 0, latestYear: 0, totalUniqueURLs: 0 }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it could be const
and res
(result) is a more canonic variable name here
return Promise.all([archiveEngine(requestData[0].data), xmlConverter(requestData[1].data)]); | ||
}) | ||
.then(response => { | ||
let res = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't merge response
right to waybackObject
?
|
||
return Promise.all([axios.get(archiveQueryURL), axios.get(alexaQueryURL)]) | ||
.then(function (requestData) { | ||
return Promise.all([archiveEngine(requestData[0].data), xmlConverter(requestData[1].data)]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because archiveEngine
and alexaEngine
are actually sync functions you don't need to wait for .then
before process them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confused about this one... Don't I have to wait on Axios requests to complete before feeding them to archiveEngine and alexaEngine? Even if those functions are sync functions, how can they operate before that data has arrived?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other words, you don't do any async actions inside of archiveEngine
or alexaEngine
function, in particular, there are no any Axios
requests inside of them. So you don't need to wrap them to Promise and return a result, you just can get a result right away. It should simplify all pipeline a lot ;)
Sure you need to get responses from axios.get
above.
waybackObject.speech += '.'; | ||
} | ||
dialog.close(app, waybackObject); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may want gracefully cover .catch
case here because you throw reject
/error
in the pipeline, and request to a server could get an exception
|
||
return Promise.all([axios.get(archiveQueryURL), axios.get(alexaQueryURL)]) | ||
.then(function (requestData) { | ||
return Promise.all([archiveEngine(requestData[0].data), xmlConverter(requestData[1].data)]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other words, you don't do any async actions inside of archiveEngine
or alexaEngine
function, in particular, there are no any Axios
requests inside of them. So you don't need to wrap them to Promise and return a result, you just can get a result right away. It should simplify all pipeline a lot ;)
Sure you need to get responses from axios.get
above.
); | ||
|
||
return Promise.all([axios.get(archiveQueryURL), axios.get(alexaQueryURL)]) | ||
.then(function (requestData) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw you can make this part of code more readable by using arguments unpacking:
.then(([archiveRes, alexaRes]) => {
return Promise.all([archiveEngine(archiveRes.data), xmlConverter(alexaRes.data)])
})
return res; | ||
} else { | ||
let error = new Error('alexaEngine didn\'t work.'); | ||
return error; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems we should throw
an error here. return
passes it as a successful response.
}) | ||
.then(([archiveEngineRes, xmlRes]) => { | ||
waybackObject = Object.assign(waybackObject, archiveEngineRes); | ||
return alexaEngine(xmlRes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alexaEngine
is sync function it doesn't return Promise
. So you can just merge its response right to waybackObject
object without waiting then
.
}) | ||
.catch(err => { | ||
debug('Wayback handler has an error: ', err); | ||
waybackObject.speech = waybackObject.error; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems waybackObject.error
would be undefined
here
expect(Promise.resolve()).to.be.a('promise'); | ||
}); | ||
|
||
it('check to see that archiveEngine is working', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adaveinthelife before merging this PR please fix those tests or note that they are not working and should be fixed
This is my first iteration of a new feature for the Google Action: the Wayback Machine. The Wayback Engine function makes requests to metadata on archive.org and alexa rankings to get:
Still lots of work to be done here. I've attempted to write a basic mechanism for slot filling but it probably needs more helper methods. Hopefully Eugene can help me with that!