--- Log opened Fri Oct 12 00:00:17 2012
wikinglol primal objective: -1000000000000000019884624838656.00000000:29
wikingmosek mosek00:29
shogun-buildbotbuild #130 of nightly_default is complete: Failure [failed test]  Build details are at
-!- adoniscik [] has joined #shogun07:00
-!- sonne|work [~sonnenbu@] has quit [Ping timeout: 246 seconds]09:30
-!- sonne|work [~sonnenbu@] has joined #shogun09:45
-!- adoniscik [] has quit [Ping timeout: 240 seconds]10:16
-!- Netsplit *.net <-> *.split quits: romi_, sonne|work12:41
-!- Netsplit over, joins: sonne|work, romi_12:45
-!- blackburn [~blackburn@] has joined #shogun13:16
blackburnwiking: you've minimized it quite a lot already!15:24
-!- too [2eda6d52@gateway/web/freenode/ip.] has joined #shogun15:28
toohi there15:28
sonne|workhi there too :D15:30
tooblackburn: hi there, are you the one who wrote the CAlphabet::translate_from_single_order function?15:30
toosonne|work: looking for some explanation about CAlphabet::translate_from_single_order15:30
sonne|worknope, me15:30
sonne|workand gunnar IIRC15:31
toobitwise operations make me crazy :p15:31
sonne|worktoo efficient :D15:31
sonne|workyeah. idea is basically to squeeze e.g. 2 characters into one byte etc15:32
sonne|workso for DNA you need just 2 bits for A,C,G,T15:32
sonne|workso you can have 4 characters encoded in 1 byte15:33
tooI see. And the max value seems to be 2^8, so an alphabet of size > 256 is not possible right now, right?15:33
sonne|workyes, it is for byte alphabets at most15:34
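The packing idea described above (2 bits per DNA symbol, 4 symbols per byte) can be sketched as follows. This is a minimal illustration, not Shogun's implementation of CAlphabet::translate_from_single_order; the code table and the function names `pack_kmer` and `unpack_kmer` are hypothetical.

```python
# Hypothetical 2-bit codes for the DNA alphabet (illustrative only,
# not necessarily the mapping Shogun's CAlphabet uses).
DNA_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
DNA_SYMBOLS = "ACGT"
BITS_PER_SYMBOL = 2


def pack_kmer(kmer):
    """Pack a string of DNA symbols into one integer, 2 bits per symbol."""
    value = 0
    for ch in kmer:
        # Shift the bits collected so far and append the next symbol's code.
        value = (value << BITS_PER_SYMBOL) | DNA_CODE[ch]
    return value


def unpack_kmer(value, k):
    """Recover the k symbols stored in a packed integer."""
    out = []
    for _ in range(k):
        out.append(DNA_SYMBOLS[value & 0b11])  # lowest 2 bits = last symbol
        value >>= BITS_PER_SYMBOL
    return "".join(reversed(out))


packed = pack_kmer("ACGT")
print(packed, unpack_kmer(packed, 4))
```

Four symbols need 4 x 2 = 8 bits, so any 4-mer fits in a single byte; packing more symbols per value requires a wider integer type.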
sonne|workif you have bigger alphabets you shouldn't use StringByteFeatures but StringWordFeatures etc anyways15:35
sonne|workand then not do this kind of encoding but use the hashing trick15:35
toothe hashing trick ?15:35
sonne|workyeah, compute a hash of your n-characters15:36
sonne|workand store just that15:36
sonne|workit is good enough for any real world app and very fast15:36
toois there any example of this proc with shogun ?15:37
sonne|worktoo: use murmurhash215:38
sonne|workin lib/Hash15:38
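A rough sketch of the hashing trick mentioned above: hash every run of n characters and keep only the hash value. MurmurHash2 is not in the Python standard library, so `zlib.crc32` is swapped in here as a stand-in hash; the function name `hashed_ngrams` and the `num_bits` parameter are assumptions made for illustration.

```python
import zlib


def hashed_ngrams(seq, n, num_bits=16):
    """Map each n-gram of seq to a bucket index in [0, 2**num_bits).

    zlib.crc32 is a stand-in for MurmurHash2; any reasonably well-mixed
    hash function serves the same purpose in this sketch.
    """
    mask = (1 << num_bits) - 1
    return [
        zlib.crc32(seq[i:i + n].encode("utf-8")) & mask
        for i in range(len(seq) - n + 1)
    ]


print(hashed_ngrams("ACGTACGT", 3))
```

The alphabet size no longer matters, because only the fixed-width hash is stored. Distinct n-grams can collide in one bucket, which is the price paid for the fixed size.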
toojust to be sure: StringWordFeatures = CStringFeatures<uint64_t> and StringByteFeatures = CStringFeatures<char> ?15:39
toothanks for the advice15:43
sonne|worknot uint64_t but uint16_t15:43
sonne|workbut you can use whatever is appropriate for your alphabet15:44
tooall right, so in practice I can make shogun's common string kernels work with string features from a bigger alphabet (?)15:48
sonne|workdifficulty depends on kernel you need though15:52
toospectrum kernel for example15:53
sonne|workI would suggest to implement DotFeatures for your feature type - that is the fastest possible way and you can use all linear SVMs (that then train using the spectrum kernel)15:54
sonne|workthere are a couple of examples for that already: the Hashed* features15:55
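To illustrate why explicit features let a linear method stand in for the spectrum kernel: the spectrum kernel of two strings is the dot product of their k-mer count vectors, so a linear SVM on those counts trains with that kernel implicitly. The sketch below is plain Python, not Shogun's DotFeatures interface; `spectrum_features`, `dot`, and `num_bits` are hypothetical names, and `zlib.crc32` again stands in for MurmurHash2.

```python
import zlib
from collections import Counter


def spectrum_features(seq, k, num_bits=None):
    """Sparse k-mer count vector of seq.

    With num_bits=None the keys are the k-mers themselves (exact spectrum).
    Otherwise each k-mer is hashed into one of 2**num_bits buckets.
    """
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if num_bits is None:
            key = kmer
        else:
            key = zlib.crc32(kmer.encode("utf-8")) & ((1 << num_bits) - 1)
        counts[key] += 1
    return counts


def dot(x, y):
    """Dot product of two sparse count vectors."""
    return sum(value * y[key] for key, value in x.items() if key in y)


a, b = "ACGTACGT", "CGTACGTT"
exact = dot(spectrum_features(a, 3), spectrum_features(b, 3))
hashed = dot(spectrum_features(a, 3, num_bits=16), spectrum_features(b, 3, num_bits=16))
print(exact, hashed)
```

Because all counts are non-negative, hash collisions can only increase the hashed dot product relative to the exact one; with enough buckets the two usually agree.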
-!- too [2eda6d52@gateway/web/freenode/ip.] has quit [Quit: Page closed]16:11
-!- sonne|work [~sonnenbu@] has quit [Quit: Leaving.]17:03
-!- adoniscik [] has joined #shogun19:11
-!- heiko [] has joined #shogun21:03
-!- romi_ [~mizobe@] has quit [Remote host closed the connection]22:22
-!- heiko [] has left #shogun []22:32
--- Log closed Sat Oct 13 00:00:17 2012