--- Log opened Wed Apr 20 00:00:36 2011
-!- blackburn [~qdrgsm@] has quit [Quit: Leaving.]00:30
-!- ameerkat [] has joined #shogun00:38
-!- ameerkat [] has quit [Ping timeout: 276 seconds]01:09
-!- lionelch [4c681efd@gateway/web/freenode/ip.] has quit [Quit: Page closed]01:40
-!- alesis-novik [~alesis@] has joined #shogun01:49
-!- josip [~josip@unaffiliated/josip] has quit [Ping timeout: 246 seconds]01:58
-!- Sabrina [~root@] has joined #shogun02:35
-!- Sabrina [~root@] has left #shogun []02:35
-!- ameerkat [] has joined #shogun05:45
-!- siddharth [~siddharth@] has joined #shogun06:51
-!- lionelch [4c681efd@gateway/web/freenode/ip.] has joined #shogun07:40
-!- lionelch [4c681efd@gateway/web/freenode/ip.] has quit [Quit: Page closed]08:01
-!- ameerkat [] has quit [Ping timeout: 264 seconds]09:19
-!- blackburn [~qdrgsm@] has joined #shogun09:21
-!- dvevre [b49531e3@gateway/web/freenode/ip.] has joined #shogun10:13
-!- siddharth [~siddharth@] has quit [Read error: Connection reset by peer]10:19
-!- blackburn [~qdrgsm@] has quit [Quit: Leaving.]10:45
-!- akhil_ [75d35896@gateway/web/freenode/ip.] has joined #shogun10:51
-!- dvevre [b49531e3@gateway/web/freenode/ip.] has quit [Quit: Page closed]11:08
-!- siddharth [~siddharth@] has joined #shogun11:08
-!- siddharth [~siddharth@] has quit [Ping timeout: 248 seconds]11:23
-!- akhil_ [75d35896@gateway/web/freenode/ip.] has quit [Ping timeout: 252 seconds]11:29
-!- siddharth [~siddharth@] has joined #shogun11:37
-!- josip [~josip@] has joined #shogun12:52
-!- josip [~josip@] has quit [Changing host]12:52
-!- josip [~josip@unaffiliated/josip] has joined #shogun12:52
-!- josip [~josip@unaffiliated/josip] has quit [Ping timeout: 260 seconds]13:59
CIA-110shogun: Soeren Sonnenburg master * r9b62354 / examples/undocumented/python_modular/ : add example for avg perceptron -
CIA-110shogun: Soeren Sonnenburg master * r7ef6961 / src/libshogun/kernel/SalzbergWordStringKernel.cpp :14:08
CIA-110shogun: fix a long standing bug / crasher14:08
CIA-110shogun: free_feature_vector was called with the wrong feature object -
CIA-110shogun: Soeren Sonnenburg master * r76bf7f7 / (2 files): Fix array out of bound errors as detected by valgrind and uninitialized memory reads. -
CIA-110shogun: Soeren Sonnenburg master * re9a7f86 / examples/undocumented/python_modular/ : fix larank example (always C=1 was used before) -
CIA-110shogun: Soeren Sonnenburg master * rc8816f5 / examples/undocumented/python_modular/ : use only single threaded svrlight -
CIA-110shogun: Soeren Sonnenburg master * r0591e84 / examples/undocumented/python_modular/ : kernel should get 2 train objects -
CIA-110shogun: Soeren Sonnenburg master * r60a6071 / data : new tests are required -
@sonney2kfinally all tests run through14:37
-!- Tanmoy [75d35896@gateway/web/freenode/ip.] has joined #shogun14:42
Tanmoyhi all14:42
-!- sploving [sploving@] has joined #shogun15:11
@sonney2ksploving, long time no see...15:14
splovingsonney2k, yeap15:14
@sonney2kglad to see you again.15:14
splovingpreparing for a paper15:14
splovingme too15:14
splovingour org have 5 slots. do you think that is enough?15:15
@sonney2ksploving, enough for what?15:20
@sonney2kwe are actually quite happy to have received so many slots - new orgs usually get 1-215:20
splovingso many proposals too 7015:21
@sonney2kyes true... I have heard of other orgs that had >120 proposals and received only 2 slots though15:21
@sonney2kso we shouldn't complain15:21
splovingI heard that the slots number is fluid until the final announcement15:23
splovingbut most of time, it will not change, right?15:23
@sonney2ksploving, it can only increase (though not in our case)15:23
@sonney2kwell we can give away slots too if we think we don't have enough strong candidates15:24
siddharthhi all15:32
splovinghello siddharth15:33
@sonney2kin the end it is critically important to us that every candidate we get succeeds. otherwise we won't be accepted into upcoming GSoC's...15:34
siddharthsonney2k, I have fixed the errors and included new class CLoss...I have also added the function vector_multiply() have to test the patch15:35
siddharthso should I pull request or after testing?15:35
@sonney2kpull request now and do the testing while I try to have a look at it... no promises that I can do it now though15:36
splovingsonney2k, if an org has some project fail, then in next year, it will have fewer slots, not accepted. This is my understanding15:36
@sonney2ksploving, or not at all be accepted - ubuntu / gimp ...15:36
splovingoh. so strict.15:37
@sonney2k(even though I am not sure if this is really true - that is what I heard though)15:37
siddharthsonney2k, did pull request16:00
siddharthsonney2k, may I test on my patch after 2 I have an important deadline on 22nd...Though I can fix errors in my patch now16:02
siddharththanks :)16:12
-!- akhil_ [75d35896@gateway/web/freenode/ip.] has joined #shogun16:14
@sonney2ksiddharth, looks much better now. What is missing though is to define one abstract CLoss class and then have several classes that derive from there and provide the loss() and dloss() functions16:24
-!- dvevre [b49531e3@gateway/web/freenode/ip.] has joined #shogun16:30
siddharthsonney2k, sry was not here16:42
siddharthWhat do u mean by 'several' classes... I mean we can have a derived class which will provide the loss and dloss function16:43
@sonney2ksiddharth, you could have a class CLoss (just as interface) and a class CLogisticLoss derived from it16:56
@sonney2k...that implements the interface16:56
siddharthok will make CLoss as abstract class16:58
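[Editor's sketch] The design discussed above (an abstract loss interface plus a concrete logistic loss deriving from it) could look roughly like the following. This is a Python illustration only, not shogun's C++ code: the class names `Loss` and `LogisticLoss`, and the choice of the margin `z = y * f(x)` as the argument, are assumptions made for the example.

```python
import math
from abc import ABC, abstractmethod


class Loss(ABC):
    """Hypothetical interface: subclasses must provide loss() and dloss()."""

    @abstractmethod
    def loss(self, z: float) -> float:
        """Loss value at margin z (assumed here to be y * f(x))."""

    @abstractmethod
    def dloss(self, z: float) -> float:
        """Derivative of the loss with respect to z."""


class LogisticLoss(Loss):
    """Logistic loss log(1 + exp(-z)), written to avoid overflow."""

    def loss(self, z: float) -> float:
        if z >= 0:
            return math.log1p(math.exp(-z))
        # For negative z, rewrite as -z + log(1 + exp(z)).
        return -z + math.log1p(math.exp(z))

    def dloss(self, z: float) -> float:
        # d/dz log(1 + exp(-z)) = -1 / (1 + exp(z))
        if z >= 0:
            e = math.exp(-z)
            return -e / (1.0 + e)
        return -1.0 / (1.0 + math.exp(z))
```

Code that consumes a `Loss` only needs the two methods, so further losses can be added as new subclasses without touching the callers.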
-!- sploving [sploving@] has left #shogun []16:59
-!- akhil_ [75d35896@gateway/web/freenode/ip.] has quit [Ping timeout: 252 seconds]17:11
@sonney2kdvevre, don't get me wrong I enjoy the discussion and your work.17:25
dvevresonney2k: not at all. very enlightening for me, the discussions with you :)17:26
@sonney2k<deep voice>I will be back</deep voice>17:46
-!- blackburn [~qdrgsm@] has joined #shogun18:45
-!- siddharth [~siddharth@] has quit [Remote host closed the connection]18:46
-!- dvevre [b49531e3@gateway/web/freenode/ip.] has quit []18:56
-!- dave718 [48e50367@gateway/web/freenode/ip.] has joined #shogun18:59
dave718Is it possible to use a masked set of features for training?  E.g. would like to do cross-validation with a RealFileFeatures dataset, but ideally would like to avoid having to generate multiple copies of the dataset.19:00
dave718Is there any way to tell the classifier to exclude a certain range of vectors?19:01
-!- josip [~josip@unaffiliated/josip] has joined #shogun19:05
-!- ameerkat [] has joined #shogun19:16
-!- Mengyun [] has joined #shogun19:33
@sonney2kdave718, heiko is working on this ... should work for string features currently, SimpleFeatures etc will follow soon20:50
-!- ameerkat [] has quit [Ping timeout: 264 seconds]20:51
-!- dave718 [48e50367@gateway/web/freenode/ip.] has quit [Quit: Page closed]20:51
blackburnsonney2k: hi21:00
blackburnhave a bit of time for discuss some CGraph issues?21:01
-!- lionelc_ [4c681efd@gateway/web/freenode/ip.] has joined #shogun21:05
-!- dvevre [~shashwat@] has joined #shogun21:08
-!- dvevre [~shashwat@] has quit [Ping timeout: 246 seconds]21:14
blackburnok, will discuss later ;)21:28
@sonney2kblackburn, discuss what?21:30
blackburnmay be you will have some ideas.. about it's design21:31
blackburncause there are some different ways to represent graph, etc21:31
@sonney2kblackburn, did I miss anything21:32
blackburnsonney2k: eh.. you don't. I mean I'm going to implement graph class21:32
blackburnfor example I don't know to store pointers in graph nodes or to store only some numbers21:32
@sonney2kgraph like in ROC curve or real graph21:32
@sonney2kreal ok21:33
@sonney2kblackburn, it is more efficient to only ever store indices21:33
@sonney2kso have an array of nodes and then have an index into it21:33
blackburnbut there is a need of enumerate objects, etc21:33
blackburnanyway sounds more proper to me too21:34
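[Editor's sketch] The suggestion above (keep the nodes in one array and refer to them only by index) can be illustrated as follows. The class name `IndexGraph` and its methods are hypothetical and chosen for the example; the graph is assumed to be undirected, which the discussion does not specify.

```python
class IndexGraph:
    """Undirected graph whose edges store node indices, not node objects."""

    def __init__(self):
        self.nodes = []      # node payloads; list position is the node index
        self.adjacency = []  # adjacency[i] holds the indices of i's neighbours

    def add_node(self, payload):
        """Append a node and return its index."""
        self.nodes.append(payload)
        self.adjacency.append([])
        return len(self.nodes) - 1

    def add_edge(self, i, j):
        """Connect nodes i and j (both must be existing indices)."""
        for idx in (i, j):
            if not 0 <= idx < len(self.nodes):
                raise IndexError(f"no node with index {idx}")
        self.adjacency[i].append(j)
        self.adjacency[j].append(i)

    def neighbours(self, i):
        """Return the payloads of the nodes adjacent to node i."""
        return [self.nodes[j] for j in self.adjacency[i]]


g = IndexGraph()
a = g.add_node("a")
b = g.add_node("b")
c = g.add_node("c")
g.add_edge(a, b)
g.add_edge(a, c)
```

Enumerating the objects, the concern raised above, then amounts to iterating over `nodes`, while the edges stay compact lists of integers.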
blackburnsonney2k: do you plan to use graphs somewhere?21:35
blackburnbecause I need to decide what capabilities should it provide21:35
@sonney2knot yet :-) you could predict graphs or compute graph kernels ;)21:36
blackburnsonney2k: will start working on CGraph after ROC21:39
@sonney2kFYI: I just heard on #gsoc that all orgs got the number of slots they requested except for new ones (they got only 1-2 with a few exceptions).21:40
@sonney2kblackburn, you need that for your dim reduction stuff?21:40
blackburnsonney2k: graph? yeap21:40
@sonney2kblackburn, btw, I fixed *all* tests21:41
blackburnit is cool21:41
blackburnhave I done something wrong (of what I don't know yet)?21:41
-!- dvevre [~shashwat@] has joined #shogun21:42
blackburnsonney2k: about your information, why did you say it and how many did you requested?21:42
-!- Mengyun [] has quit [Remote host closed the connection]21:43
@sonney2kblackburn, I don't think so - all good. Heiko added some tests that triggered bugs I had in my code21:44
blackburnHeiko? don't see his commits :)21:45
@sonney2kblackburn, I think it is good to know21:45
@sonney2kblackburn, he is on vactions21:45
@sonney2kvacations that is21:45
blackburnsonney2k: I mean we already know that we the happy ones with 5 slots ;)21:45
blackburnI finally has get rid of my cold but haven't much time for developing just now, will at this weekend..21:47
@sonney2kblackburn, well it was an official statement by the google guys - the other information was just from some paper21:47
blackburnah, I see21:48
blackburnanyway we are lucky21:48
-!- Mengyun [] has joined #shogun21:50
@sonney2kand no conflicts so far.21:56
-!- dvevre [~shashwat@] has quit [Ping timeout: 252 seconds]21:58
blackburnsonney2k: do you mean duplication conflicts?21:59
-!- dvevre [~shashwat@] has joined #shogun21:59
@sonney2kyes - none so far21:59
blackburnhm. forgot to implement precision21:59
blackburnit seems to be important measure21:59
@sonney2kPRC - just use the formula in the python script22:01
blackburnsonney2k: I trust python script with accuracy now :)22:02
@sonney2ksonney2k, at least sth :)22:02
@bettyboosonney2k, funny22:02
-!- Tanmoy [75d35896@gateway/web/freenode/ip.] has quit [Ping timeout: 252 seconds]22:03
blackburnand before PRC I should implement ROC22:04
@sonney2kblackburn, yeah but it could be that the roc script in the c++ file is more correct22:04
blackburnsonney2k: will look at it22:05
blackburnsonney2k: what is the most convenient way to return ROC graph?22:08
blackburnI mean there is irregularity on both axes and we should return points..22:08
@sonney2ksonney2k, ehh irregularity? there should be as many points as there are labels...22:09
blackburnin perfmeasures it did with 2-d array22:09
@sonney2kblackburn, makes sense 2d matrix22:09
blackburnsonney2k: I mean scores are irregular. and we can't return only y-axis22:10
josipsonney2k: as many labels + 1 I think22:10
blackburnokay, will do it the same as in perfmeasures22:10
josipthe code now works with a (N+1)x2 matrix iirc22:11
@sonney2kn+1 - could be22:11
josipas you do n thresholds for each label and one for (0,0)22:11
josipor rather (1,1) depending on what you threshold22:11
@sonney2k(0,0) and (1,1) are definitive end points22:12
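[Editor's sketch] The (N+1)x2 layout described above, one threshold per label plus the (0,0) end point, can be sketched like this. The function name `roc_points` and the convention that labels are +1/-1 are assumptions for the example. Tied scores are not treated specially here, which is the weakness mentioned later in the log for the python script.

```python
def roc_points(scores, labels):
    """Return N+1 (false positive rate, true positive rate) pairs.

    Examples are visited from highest to lowest score; each one lowers the
    threshold by one step. The curve starts at (0, 0) and ends at (1, 1).
    """
    n_pos = sum(1 for y in labels if y > 0)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i in order:
        if labels[i] > 0:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


curve = roc_points([0.9, 0.8, 0.7, 0.6], [1, -1, 1, -1])
```

For these four examples `curve` has five rows, matching the N+1 count discussed above.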
@sonney2kblackburn, the script in shogun should compute the convex hull of the roc curve only a realizable one22:14
@sonney2kthe one in the python script has problems when there are multiple outputs that are equal22:15
blackburneh.. okay22:16
@sonney2kblackburn, there is an under / over estimated roc that one can compute and one that is convex hull - read Fawcett's paper if you are interested
blackburnalready reading it ;)22:18
blackburnthank you22:18
josipcomputing the convex hull is faster than the trapezoid algorithm?22:20
josipignore my previous question22:21
-!- dvevre [~shashwat@] has quit [Ping timeout: 240 seconds]22:21
-!- dvevre [~shashwat@] has joined #shogun22:22
@sonney2kblackburn,  in principle one would have to do it for PR curve too - but no idea how that works22:22
@sonney2kjosip, equivalent :)22:22
blackburnsonney2k: ok, will see, now will just make ROC, later auROC22:24
@sonney2kblackburn, when you have the ROC Curve auROC is easy22:24
blackburnit seems so, but don't understand about decision between trapezoid and convex hull..22:25
blackburnwill understand it after some readings ;)22:26
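[Editor's sketch] Once the ROC points are available, the area under the curve can be computed with the trapezoid rule mentioned above. This is a minimal illustration; the function name `area_under_curve` is hypothetical, and the input is assumed to be a list of (x, y) pairs sorted by increasing x. It does not compute the convex hull variant.

```python
def area_under_curve(points):
    """Trapezoid-rule area under a piecewise linear curve.

    points: sequence of (x, y) pairs ordered by non-decreasing x.
    """
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Each segment contributes width times mean height.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# ROC points for scores [0.9, 0.8, 0.7, 0.6] with labels [+1, -1, +1, -1]
example_curve = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
example_area = area_under_curve(example_curve)
```

Vertical segments have zero width and add nothing, so only the steps along the false positive axis contribute to the sum.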
blackburnsonney2k: btw, seems ROC will not be implemented in ContingencyTableEvaluation22:30
@sonney2kblackburn, of course not :)22:30
@sonney2kextra class22:30
@sonney2klike PRC22:30
blackburndid it, but have been thinking about its merging - and it makes no sense at all22:30
@sonney2kblackburn, :)22:31
-!- josip [~josip@unaffiliated/josip] has quit [Read error: Connection reset by peer]22:33
-!- josip [~josip@unaffiliated/josip] has joined #shogun22:34
blackburnwill implement it tomorrow on 'equations of math.physics' lectures at university :D22:35
-!- dvevre [~shashwat@] has quit [Ping timeout: 240 seconds]22:36
-!- dvevre [~shashwat@] has joined #shogun22:37
CIA-110shogun: Soeren Sonnenburg master * r6355705 / src/libshogun/lib/BinaryFile.cpp :22:43
CIA-110shogun: turn total_size into size_t type (array can be >2GB!) and add22:43
CIA-110shogun: whitespaces to improve readability (+5 more commits...) -
blackburnhm. 4222:47
blackburnsonney2k: caught you!22:49
@sonney2kblackburn, 42? what?22:50
blackburnaha :)22:50
@sonney2kyes that fixes a warning though that is code that won't be reached22:50
blackburnsonney2k: just saw it and wondered was ist das :)22:51
@sonney2kyeah it is strange...22:52
-!- dvevre [~shashwat@] has quit [Ping timeout: 248 seconds]22:59
blackburnsonney2k: how can I help you with project in may?23:02
blackburnor it was an another joke? :D23:03
@sonney2kblackburn, I am not sure I understand the question? You don't have enough tasks to solve yet?23:04
blackburnsonney2k: so it seems to be an another joke :) you said I could help you after your child is born23:04
@sonney2kblackburn, ahh well discuss with the students that are looking for help of course and solve their problems :)23:05
blackburnstudents? eh?23:06
* sonney2k is looking at the commit count23:07
blackburnah, nevermind, another joke I accepted as serious offer ;)23:07
@sonney2kyou have 72 commits already - by far more than all others23:07
blackburnI mean there won't be more than 6-7 students23:07
blackburn*may be some will stay23:08
blackburnbut possibility is about zero, yeah ;)23:08
blackburnsonney2k: and how can I help them? there would be mentors for it, etc23:09
@sonney2kof course I will try to give some feedback - but it is tough in the first 2 weeks. won't have much if at all sleep23:09
@sonney2kblackburn, usually it helps to just discuss about ideas with someone23:10
@sonney2kideally all the accepted students discuss publicly on the mailinglist about their plans23:10
blackburnsonney2k: in that case of course i would be happy to help you23:10
@sonney2kand write (very short!!!) what they did/plan to do this week23:11
blackburnbut I haven't much authority for doing this just like you23:11
@sonney2kand of course IRC for faster round trip times23:11
@sonney2kI do lots of mistakes...23:11
@sonney2kall that is needed/wanted is some common sense23:12
blackburnsonney2k: do you think posting weekly plan to ML is a good idea?23:12
@sonney2kin the beginning definitely23:12
@sonney2kwhen everyone is working as expected we might loosen this criterion23:12
@sonney2krecall that we need 100% success ....23:13
blackburnokay, induced me ;)23:15
blackburnsonney2k: I have an idea of develop some latex template for weekly report, how you mind it?23:16
@sonney2kblackburn, ascii to the mailinglist only23:16
blackburnas you wish :P23:16
@sonney2kno one can read more than a few sentences23:16
@sonney2ks/can/wants to/23:16
josipanother idea is to keep a public blog23:17
josip+ code reviews23:17
josipwe did something like this last year and it worked nicely, as you get plenty of feedback23:17
josipand put all of the blogs on a planet (the aggregation software)23:17
@sonney2kjosip, yeah but a mailinglist at least has the interested people reading what you write23:18
@sonney2kjosip, we are not big enough to have that...23:18
@sonney2kjosip, that would make a lot of sense for e.g. debian or such big orgs23:18
josipyes, it's not that it's hard to track 5 blogs, but due to the laziness factor it might be convenient23:19
josipand it's super easy to setup23:19
@sonney2kthe machine learning community is already a niche and shogun is even more23:19
josipwell, if people are used to the ml, then that might be best I suppose23:19
* blackburn likes the idea of blogs23:20
@sonney2khmmhh is any of you seriously blogging?23:20
* josip not23:20
* sonney2k not23:20
* blackburn not23:20
@sonney2kok then mailinglist23:20
josipbut gsoc-specific blogs might be doable23:20
josiphehe, yeah :)23:21
blackburnsonney2k: have some fixes for already implemented ContingencyTableEvaluation23:21
blackburnsonney2k: shall I make a pull request for it or unite it with ROC?23:22
lionelc_I think it would be a good idea to have a blog such as or wordpress.com23:22
@sonney2kblackburn, make a pull request - the smaller the better23:22
@sonney2klionelc_, I am not against having a blog if you guys blog about what you do (or someone blogs at all).23:23
@sonney2kIt is just that I am too lazy for that and writing an email with 3-5 sentences is much easier to *me* personally23:23
josipwell, because weekly reports can get lengthy we did this last year: brief summaries on the mailing list and the long version on the blog23:25
lionelc_yep... I mean in a blog students at least can post html instead of plain ascii, possibly with some social buttons. a "like" button may make students more motivated23:26
lionelc_sonney2k: so for a quick review of 3-5 sentences from mentors, it can be done via comments23:26
@sonney2kalright then lets have both and some kind of planet shogun thing23:26
@sonney2kjosip, any idea how to set that up23:27
@sonney2kI wouldn't even mind to include students willing to contribute but not accepted into GSoC23:27
josip if I'm not mistaken23:28
@sonney2kbut I don't have a blog and I am too lazy anyways to write text for humans :)23:30
-!- dvevre [~shashwat@] has joined #shogun23:30
josiphehe, and you can set up free blogs at posterous/blogspot/whatnotelse23:31
blackburnsonney2k: here23:31
blackburnand yes I don't know why I forgot to write _full_ doc for it :D23:32
blackburnsonney2k: omg, you make me ashamed :)23:34
@sonney2kblackburn, many small typos ...23:34
@sonney2kshogun has probably the most weird english like documentation ever written by exclusively non-natives ;-)23:35
@sonney2kblackburn, I am not excluding myself here23:36
blackburnmy english is f-ng awful23:37
@sonney2kyou need to share a room with japanese23:37
josiplol :)23:37
blackburnwill it help? :D23:37
josipI think there's only one native speaker here23:37
@sonney2kthis is how I learned english (it really forced me to speak)23:37
josipsonney2k: you learned english from writing docs?23:38
* sonney2k hates writing docs23:38
josipI have to improve my German over the summer, so that might be one way of doing it hehe :)23:38
@sonney2kthe problem is when you write the code it is only you who can document it23:38
@sonney2kjosip, german documentation is pretty useless23:39
@sonney2kjosip, but feel free to translate shogun's documentation to german :-D23:39
lionelc_it always feels bad in writing docs but at the same time, you also hate to have no docs to read when tracing some new codes :-)23:40
-!- josip [~josip@unaffiliated/josip] has quit [Quit: Changing server]23:40
-!- josip [~josip@] has joined #shogun23:40
-!- josip [~josip@] has quit [Changing host]23:40
-!- josip [~josip@unaffiliated/josip] has joined #shogun23:40
josipso what's the last thing you read from me? I managed to plug off the router with my foot23:41
@sonney2kwe are not overly strict with that either, an example for all the languages and the doxygen doc - that's it mostly23:41
@sonney2k<josip> I have to improve my German over the summer, so that might be one way of doing it hehe :)23:41
@sonney2k<sonney2k> the problem is when you write the code it is only you who can document it23:41
@sonney2k<sonney2k> josip, german documentation is pretty useless23:41
@sonney2k<sonney2k> josip, but feel free to translate shogun's documentation to german :-D23:41
@sonney2k<lionelc_> it always feels bad in writing docs but at the same time, you also hate to have no docs to read when tracing some new codes :-)23:41
@sonney2k<bettyboo> :>23:41
@bettyboosonney2k: :>23:41
blackburnsonney2k: fixed it23:41
josipthe docs are a very important part of the code23:42
josipthere was a very funny stackoverflow question with 'funniest comments'23:42
CIA-110shogun: Sergey Lisitsyn master * red5f12f / (2 files): Added documentation, added precision measure -
CIA-110shogun: Sergey Lisitsyn master * rc3d602c / (2 files): Fixed shameful mistypes -
CIA-110shogun: Soeren Sonnenburg master * r02c32c3 / (2 files): Merge branch 'master' of -
lionelc_sonney2k: just curious, why there is a Chinese-version tutorial of shogun? it seems to be the only non-English version right now23:43
@sonney2klionelc_, well a chinese volunteered to translate it23:44
@sonney2kand I said why not23:44
* blackburn is going to sleep after a shameful commits he done :D23:44
blackburnsee you23:44
@sonney2kafter he managed I became scared what he might have written23:44
lionelc_I see.23:45
@sonney2kso I ran everything through google translate - and indeed it matches the topic :D23:45
@sonney2kblackburn, have a nice sleep23:45
blackburnsonney2k: good night23:45
-!- blackburn [~qdrgsm@] has quit [Quit: Leaving.]23:45
lionelc_blackburn: have a nice sleep. see you here tomorrow23:45
@sonney2kjosip, that url is hilarious :)23:46
@sonney2klink that is23:46
lionelc_sonney2k: you mean you ran everything via google English-Chinese translate?23:46
@sonney2klionelc_, how would you check that it is not curses and complaints?23:47
lionelc_sonney2k: ummm... never did so. but I think I can set a range of key words, and do pattern matching23:48
@sonney2klionelc_, what does tell you?23:49
@sonney2kI cannot even read anything23:49
@sonney2kexcept short double etc23:49
@sonney2kwhich reminds me that this page is outdated23:49
lionelc_sonney2k: obviously there are no curses/complaints there23:50
@sonney2kheh :D23:51
lionelc_just a description of what Shogun can do. and yes, it is outdated23:51
lionelc_sonney2k: the "installation" part is too short, which can deter many potential Chinese users if they fail in installation23:52
@sonney2kI welcome any doc related contribution!23:53
josipsonney2k: do you happen to know any good ML people at ETHZ?23:53
josipI know Andreas Krause is there and they have the ML Group with Prof. Buhmann . Anyone else?23:53
@sonney2kI know Cheng Soon Ong there - he is also with JB23:54
josipand Peter Buhlmann from the Stats group23:54
josiphm, thanks. It seems like a nice place23:54
lionelc_sonney2k: also, if some tutorials on SVM itself can be added there, at least some links, then they can be helpful23:56
josipI hope I'll like it23:56
@sonney2klionelc_, patches welcome23:56
@sonney2klionelc_, I guess one could write a book about all the things in shogun...23:57
lionelc_sonney2k: lol. what I proposed should be done by the webmaster, who can edit the webpages23:57
lionelc_sonney2k: I think you have someone who specially manages the website?23:57
@sonney2klionelc_, no I don't want to do this23:58
@sonney2kand btw. the website is generated from doxygen docs23:58
@sonney2kin doc/pages dir23:59
lionelc_sonney2k: I see23:59
@sonney2klionelc_, so patches welcome :D23:59
--- Log closed Thu Apr 21 00:00:14 2011