#sugar-newbies, 2016-08-11

All times shown according to UTC.

Time Nick Message
05:48 icarito has quit IRC
05:49 icarito <icarito!bip@2001:4830:134:7::11> has joined #sugar-newbies
11:03 meeting <meeting!~sugaroid@rev-18-85-44-69.sugarlabs.org> has joined #sugar-newbies
16:57 tony37 <tony37!~tony@tmo-114-47.customers.d1-online.com> has joined #sugar-newbies
16:57 iamutkarshtiwari <iamutkarshtiwari!65d4445a@gateway/web/cgi-irc/kiwiirc.com/ip.101.212.68.90> has joined #sugar-newbies
16:57 iamutkarshtiwari tony37: Hi
16:58 #startmeeting
16:58 meeting Meeting started Thu Aug 11 16:58:00 2016 UTC. The chair is iamutkarshtiwari. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:58 Useful Commands: #action #agreed #help #info #idea #link #topic #endmeeting
16:58 tony37 hello
16:58 iamutkarshtiwari How are you?
16:58 tony37 fine and you?
16:58 iamutkarshtiwari I am good. At home :)
16:59 tony37 i am at dinner in the campground
16:59 iamutkarshtiwari Any luck with testing getbooks?
16:59 Oh..
16:59 You should finish your dinner first.
16:59 We can talk later.
16:59 tony37 The campground internet blocks downloading the newer Rachel version. I plan to upgrade versions next month when I get back to the US.
16:59 Dinner and meeting is cool
17:00 iamutkarshtiwari You must be using both hands for typing then how would you eat ? :P
17:00 tony37 when it is your turn :)
17:00 iamutkarshtiwari oh :D
17:01 Did you try replacing your index.html files with mine?
17:01 tony37 I'll try GetBooks tomorrow with the pointers you gave me to the processing steps.
17:02 The problem with replacing index.html is that we have to handle the ones the server gives us.
17:02 iamutkarshtiwari I agree
17:02 We need to find a better way to parse the index.html files without touching/modifying them.
17:02 I'll do that after infoslicer.
17:03 tony37 So I think we will need the cfg file to call a python function that handles the html response
17:03 iamutkarshtiwari Only the deployer has to provide the url to the respective index.html files. Rest 'GetBooks' will do itself.
17:04 tony37 Agreed. I see the deployer (or packager) writing a BeautifulSoup script to process the html
17:05 iamutkarshtiwari But if you would be testing it with an older version of Rachel collections then I doubt you will find any success :/
17:05 tony37 No, I need to handle the index.html as it is which may be different from the way the newer version works.
17:05 iamutkarshtiwari I'd suggest you download the latest collection and then test it. For now our primary target should be making it work on your side.
17:06 But you see that the code I wrote is dependent on the index.html file I provided.
17:06 tony37 Sadly, the ck12 is about 1.6GB and the other is about 600MB. Both are too big for the ISP here.
17:07 iamutkarshtiwari Can you provide the index.html files for your current collections?
17:07 tony37 Yes. I think you need to consider this a sample of the code needed. Remember the goal is to provide access to any ebook collection provided by the schoolserver
17:07 iamutkarshtiwari I can try modifying them
17:08 tony37 Yes, I'll send them as an email attachment
17:08 iamutkarshtiwari But parsing with BS would depend upon how books are setup in their index.html files.
17:08 If the schema gets disturbed, it won't work.
17:08 tony37 Exactly. So we will need to give some examples and perhaps.
17:09 This is the basis for BeautifulSoup -if the online source changes the html then the BeautifulSoup script will need to be changed.
17:09 iamutkarshtiwari Yup
17:09 tony37 perhaps some guidance
17:10 iamutkarshtiwari I'll fix your index.html files and try to help you make it work.
17:10 #Infoslicer
17:10 tony37 My real concern is the factoring of GetBooks. If we can access Rachel on the schoolserver, GetBooks should also be able to do it online.
17:11 iamutkarshtiwari It should.
17:11 For that we would be needing OPDS technique
17:12 tony37 The real issue is not whether the content is online or on the schoolserver but whether it is based on opds or not
17:12 iamutkarshtiwari Would you like it to be OPDS dependent?
17:12 tony37 No
17:12 iamutkarshtiwari *for online collections*
17:13 tony37 Perhaps the way to go is to add a parameter to the cfg file in addition to the url. One option could be opds, the other could be a python module name
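A minimal sketch of the cfg idea tony37 describes above: each source carries a url plus a second parameter that is either `opds` or the name of a Python module. The section names, keys, URLs, the `parse_opds` placeholder, and the `parse_index` entry point are all assumptions made for illustration; none of them come from the GetBooks code.

```python
import configparser
import importlib

# Hypothetical cfg contents; section names, keys, and URLs are illustrative.
SAMPLE_CFG = """
[online-catalog]
url = http://example.org/catalog.atom
handler = opds

[rachel-gutenberg]
url = http://schoolserver.example/rachel/ebooks/index.html
handler = rachel_gutenberg
"""


def parse_opds(response_text):
    """Stand-in for an OPDS code path (hypothetical placeholder)."""
    return []


def load_sources(cfg_text):
    """Map each cfg section to its (url, handler name) pair."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return {
        name: (parser[name]["url"], parser[name]["handler"])
        for name in parser.sections()
    }


def resolve_handler(handler_name, loader=importlib.import_module):
    """Return the callable that turns a server response into a book list."""
    if handler_name == "opds":
        return parse_opds
    # Otherwise treat the value as a module written by the deployer.
    # `parse_index` is an assumed entry-point name, not an existing API.
    return loader(handler_name).parse_index
```

With this shape, supporting a new collection would mean adding a cfg section and, if it is not OPDS, one small module that knows that collection's html layout.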
17:13 iamutkarshtiwari Parsing online collections with BS would be an overkill.
17:13 Their html files contain lots of unnecessary scripts.
17:13 tony37 In my experience - probably not. Consider accessing pustakalaya
17:14 iamutkarshtiwari They do that with BS?
17:14 tony37 The online version makes extensive use of php scripts. The offline version uses Django
17:15 iamutkarshtiwari Oh.. I see.
17:15 tony37 What is common is that urllib loads an html file to be processed
17:16 iamutkarshtiwari That's what we are currently doing with GetBooks
17:16 tony37 Yes. We really don't care what the server side needs to do to provide an html response, we need to process the response to display on the screen
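A rough sketch of "process the response" as discussed above: take whatever html the server returns and pull out the book links, without caring how the server produced it. The log talks about BeautifulSoup; this sketch swaps in the standard library's `html.parser` so it runs without extra packages. The sample markup and the file extensions are assumptions, not the real Rachel index.html layout.

```python
from html.parser import HTMLParser


class BookLinkParser(HTMLParser):
    """Collect (title, href) pairs for links that look like ebooks."""

    def __init__(self, extensions=(".epub", ".pdf")):
        super().__init__()
        self.extensions = extensions  # assumed set of book file types
        self.books = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.lower().endswith(self.extensions):
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that sits inside a matching link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = " ".join("".join(self._text).split())
            self.books.append((title, self._href))
            self._href = None


def extract_books(html):
    """Return the book links found in an html response."""
    parser = BookLinkParser()
    parser.feed(html)
    parser.close()
    return parser.books


# In practice the html would come from something like
# urllib.request.urlopen(url).read().decode("utf-8").
SAMPLE_HTML = (
    '<ul><li><a href="books/alice.epub">Alice in\n Wonderland</a></li>'
    '<li><a href="about.html">About</a></li>'
    '<li><a href="books/math.PDF"><b>Math</b> Basics</a></li></ul>'
)
books = extract_books(SAMPLE_HTML)
```

As noted in the discussion, a parser like this is tied to the markup it was written for; if the source changes its html, the script has to change too.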
17:17 iamutkarshtiwari Would you prefer me finish adding both 'search' and 'online access to Rachel' to GetBooks first or #Infoslicer ?
17:18 tony37 Infoslicer is the priority. I am meeting with Rudolf Simon on Sunday (the host of the Wikimedia meeting). I don't expect InfoSlicer to be ready to demonstrate then but I will show him it working with an online wiki
17:20 iamutkarshtiwari I still have to add search functionality to offline collections. Could you please guide me with it?
17:21 tony37 Search is a problem indeed. In general, the scenario should be that the first url downloads a list of topics, the user selects a topic, then we download a list of books in that topic
17:21 So no search should be needed.
17:22 iamutkarshtiwari Should I remove search functionality from that? What if the offline book collection is too huge?
17:22 Search would save time
17:22 tony37 Agreed. For example, the real gutenberg collection has 45000 books + and there is no breakdown by topic
17:23 iamutkarshtiwari So there is just 1 level hierarchy ?
17:23 tony37 However, the problem is that search needs to be performed server side (you don't want to download the full collection to search it).
17:24 However, if 40 XOs concurrently search the server side processor will be overwhelmed.
17:24 iamutkarshtiwari Point!
17:25 tony37 This is the benefit of Rachel's gutenberg collection, they have formed collections of the books by topic.
17:26 Kiwix provides an index to facilitate search server-side (the index is nearly as big as the collection itself!). Using hash techniques reduces the processor load
17:27 So in the case of GetBooks, the search could use the online kiwix search and return the list found by the search
17:27 iamutkarshtiwari Are you using kiwix for testing GetBooks?
17:28 tony37 Not yet. But Tim Moody (XSCE) has created a Gutenberg zim file with an index so the full collection is served by kiwix
17:29 iamutkarshtiwari Would our current BS technique work with that zim file?
17:29 tony37 Yes, in any case the query returns an html file.
17:30 iamutkarshtiwari Please do send me your index.html files. I'll try to fix them.
17:31 Shall we discuss about #InfoSlicer?
17:31 tony37 Oops! The idea is that you will figure out how to handle them.
17:31 Yes, on #infoslicer - go
17:32 iamutkarshtiwari I have been going through the code for a while to understand how everything works.
17:32 tony37 Not an easy taks
17:32 task*
17:32 iamutkarshtiwari It seems that IS uses MediaWiki API to parse the data from the online pages
17:32 and then it separates the text from images
17:33 Are you sure BS would be enough to do all that?
17:33 tony37 I really need to use InfoSlicer - I haven't in the past since it was dependent on the internet. Our users really need an offline version of the article as a web page
17:34 So I really don't know why the text and images are being separated
17:34 iamutkarshtiwari IS separates text and images so that it becomes easy for users to customize their articles.
17:34 tony37 If the web page is saved to the Journal we can use Browse to open it
17:35 iamutkarshtiwari They can choose what text to cut and what images to copy.
17:35 That is the end result.
17:35 tony37 It is really quite easy anyway. Users can copy an image (which in Sugar gets copied to the Journal) or copy selected text
17:35 iamutkarshtiwari Start is with deciding how to separate and show text and images in InfoSlicer from zim files.
17:36 tony37 So zim is not important if the urllib response is a web file.
17:36 iamutkarshtiwari No. InfoSlicer allows users to drag and drop images/text from the online wiki article in realtime.
17:36 tony37 How important is that versus a copy and paste?
17:37 iamutkarshtiwari Our main challenge is to segregate images from text and show it in their respective panels.
17:38 By copy/paste I meant ~ drag/drop. Sorry my bad :/
17:38 tony37 Yes - as I say I am not familiar with the InfoSlicer. Perhaps, we could have a mode which just shows the page.
17:39 Alternatively, why doesn't the code already there handle an html file?
17:39 iamutkarshtiwari That would be a major change to the activity. We should try not to make any huge changes to UI of InfoSlicer.
17:39 It does handle the html file.
17:39 tony37 Unless necessary
17:40 iamutkarshtiwari But most of the tasks are handled by MediaWiki api.
17:40 tony37 So why does it not handle an html file delivered from a schoolserver
17:40 So this is the rub. We have MediaWiki on the school server but the zim articles do not work with MediaWiki
17:40 iamutkarshtiwari MediaWiki api is designed to extract meaningful data from the online wiki pages.
17:41 Exactly!
17:41 If that had been possible, Mr. Bender would have already added offline support to it. ;)
17:42 tony37 So if you get a page from kiwix, you may need to have an alternate screen showing the page with the option to save it to the Journal (or the user could copy/paste text or images from the display to another activity)
17:42 iamutkarshtiwari That would also be an overkill.
17:43 The drag and drop functionality is already there in IS
17:43 tony37 Which part would be overkill?
17:43 iamutkarshtiwari All we need to do is figure out a way to display the wiki content to their respective panes.
17:43 Copy/pasting text/images from display to another activity.
17:44 There wouldn't be a need for another activity when IS is capable of doing all that stuff.
17:44 We need to leverage it out of it.
17:44 tony37 Great - but then you need to figure out how to do that with html delivered from Kiwix
17:45 iamutkarshtiwari I am working on it. Was just stuck at figuring out how to handle and search through those zim files without using MediaWiki api
17:45 tony37 BeautifulSoup may be the solution
17:46 iamutkarshtiwari BS seems to be very solution specific to me. Though it makes things work.
17:46 tony37 The concept is to get the html and then process it for display. This means doing the work locally instead of server-side.
17:46 iamutkarshtiwari #FinalEvaluations are just around the corner.
17:47 Agree
17:47 tony37 So type faster!
17:47 iamutkarshtiwari Yes!
17:48 We also need to document our previous work.
17:48 tony37 Just kidding. My experience is that every GSOC finishes before the work is done. You have accomplished a lot. I am really happy that most of what you have done will become a part of Sugar.
17:49 iamutkarshtiwari I am really looking forward to that :)
17:49 tony37 The work with InfoSlicer may be helpful at the Wikimedia conference (this is a major annual meeting of Wikimedia Germany which this year Rudolf Simon, my Rwanda sponsor, is hosting).
17:49 iamutkarshtiwari But we need to convince our community for merging our patches
17:50 tony37 Not your problem per GSOC. Could be if you continue as a Sugar Labs volunteer contributor.
17:50 iamutkarshtiwari Wish you luck for that conference. I am on it. Hopefully will finish offline support soon.
17:50 I am happy to help!
17:51 I am highly attached to the kind of work being done here. I started my open source Journey with Sugar Labs.
17:51 tony37 The conference is Sept. 16-18 so we have some time. I am hoping you will have some time to continue working as a volunteer - although you will be starting school which will take most of your time.
17:52 iamutkarshtiwari Whatever I have accomplished today is because of mentors at Sugar Labs and you :)
17:52 Yes. But I'll try to manage time for that.
17:53 tony37 You have certainly been exposed to the real world of software development. There have actually been formal studies of software development.
17:53 iamutkarshtiwari Yes! I have started to get a hang of it. Here we could make our own design and UI decisions.
17:54 tony37 These studies show that bringing an experienced developer into an ongoing project causes delays because the new developer questions the design decisions already made.
17:54 iamutkarshtiwari Yes.
17:55 tony37 This is a learning moment which may help you understand similar issues in the future.
17:56 iamutkarshtiwari I am still learning a lot by contributing to such great projects. Learning is a never-ending process..
17:56 But it will definitely help me in future!
17:56 tony37 One of the lessons I have learned is that working code trumps design ideas. So I like to have a clear idea of what is needed before starting to write code.
17:56 iamutkarshtiwari You must have finished your dinner by now? :P
17:57 tony37 Indeed, I did. It was excellent. Now I have only the rest of my beer to finish.
17:57 iamutkarshtiwari Umm... Beer!
17:57 tony37 Once started on implementation, try to get to working code and then consider alternative designs. Otherwise, you stop work waiting on a design decision.
17:58 The Germans put a lot of stock in their beer!
17:58 iamutkarshtiwari Enjoy your beer ;)  Beer is also Alan's favourite.
17:58 Yes. I'll keep that in mind.
17:59 See you soon
17:59 #endmeeting
17:59 meeting Meeting ended Thu Aug 11 17:59:08 2016 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot. (v 0.1.4)
17:59 Minutes: http://meeting.sugarlabs.org/s[…]-11T16:58:00.html
17:59 tony37 It is interesting. In France, wine is more important than beer - but cheaper. However, in Germany, beer is number one and cheaper. In both cases, Coke can be more expensive.
17:59 meeting Log:     http://meeting.sugarlabs.org/s[…]16-08-11T16:58:00
17:59 iamutkarshtiwari haha :D
17:59 tony37 What really amazes me is that bottled water sells for the same price as Coke.
17:59 iamutkarshtiwari In France and Spain they use wine in their food.
17:59 To enhance the flavours.
18:00 tony37 At every opportunity - Italy as well.
18:00 iamutkarshtiwari Same here in Goa.
18:00 tony37 I think the Portuguese tradition is similar to that of Spain.
18:00 iamutkarshtiwari Here the water bottles are more expensive than beer bottles.
18:00 They all are closely related to each other both in looks and language
18:01 tony37 Although not in politics.
18:01 iamutkarshtiwari I haven't been to those countries so can't say about it.
18:02 But Nepalis are very peace loving people ;)
18:02 And so are the girls, they are very beautiful!
18:03 See you soon
18:04 Bye
18:04 #endmeeting
18:04 tony37 Off the technical subject, but one thing that is fascinating to me is Venice. Until 1500, it was a powerhouse in Europe because it controlled the silk route trade by bringing the goods from the Eastern Mediterranean to Venice and then on by land to European markets. The cost went up because the goods were sold along the way from one merchant to another. The Portuguese found a way to go to India (and China) directly by ship. So when the goods came to Europe, there was no merchant-merchant markup. This was the end of Venice. It is now a tourist city (and wonderful) and has been since the 16th century. Goa was, of course, a part of that story.
18:05 bye
18:05 tony37 has quit IRC
18:07 iamutkarshtiwari has left #sugar-newbies
