AltME: Databases

Messages

afsanehsamim
Hey guys... I have just 2 days left for my project! Could you help me?
I could not do the last step... I need to show the result of comparing values on a web page.

TomBon
A quick update on Elasticsearch.
Currently I have reached 2TB data size (~85M documents) on a single node.
Queries are now starting to slow down, but the system is very stable even under
heavy load. While queries on average took between 50-250 ms against a
dataset of around 1TB, the same queries are now in a range between 900-1500 ms.
The average allocated Java heap is around 9GB, which is nearly 100% of the
max heap size, with a setting of 15 shards and 0 replicas.
Elasticsearch looks like a very good candidate for handling big data with
a need for 'near realtime' analysis. Classical RDBMSs like MySQL and PostgreSQL
were grilled at around 150-500GB. Another tested candidate was MongoDB,
which was great too, but since it stores all metadata and fields uncompressed
the waste of disk space was ridiculously high. Furthermore, query execution times
differed unpredictably, without any known reason, by a factor of 3.
Tokyo Cabinet started fine, but at around 1TB I noticed file integrity problems
which led to endless restoring/repairing procedures. Adding sharding logic
by coding an additional layer wasn't very motivating, but could solve this issue.
Within the next six months the data size should reach the 100TB mark.
It would be interesting to see how Elasticsearch will scale and how many
nodes are necessary to handle this efficiently.
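As a rough back-of-envelope check on the figures above, the sketch below derives the average stored size per document and a naive node count for the 100TB target. It assumes decimal terabytes and, purely for illustration, that every node keeps holding about 2TB as in the single-node setup. Neither assumption comes from the measurements, and real scaling will depend on heap size, shard count, and replicas.

```python
# Figures quoted in the message above.
DATA_BYTES = 2 * 10**12        # 2TB on a single node (assumes decimal TB)
DOC_COUNT = 85 * 10**6         # ~85M documents
TARGET_BYTES = 100 * 10**12    # expected 100TB data size

# Assumption for illustration only: each node keeps holding ~2TB.
BYTES_PER_NODE = DATA_BYTES


def avg_doc_bytes(total_bytes: int, docs: int) -> float:
    """Average stored bytes per document."""
    return total_bytes / docs


def nodes_needed(target_bytes: int, bytes_per_node: int) -> int:
    """Naive node count: ceiling of target size over per-node capacity."""
    return -(-target_bytes // bytes_per_node)


if __name__ == "__main__":
    print(f"avg doc size: {avg_doc_bytes(DATA_BYTES, DOC_COUNT) / 1000:.1f} KB")
    print(f"nodes for target: {nodes_needed(TARGET_BYTES, BYTES_PER_NODE)}")
```

Under these assumptions the average comes out at roughly 23.5 KB per document, and the naive estimate is 50 nodes with no replicas. Any replica setting above 0 would multiply that figure.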
Maxim
when you talk about "documents" what type of documents are they?
Gregg
Thanks for the info Tomas.
TomBon
Crawled HTML/MIME embedded documents, images, etc., stored as plain compressed source (avg. 25 KB) plus 14 searchable metafields (n-gram), used to train different NN types for pattern recognition.
Maxim
thanks  :-)

MaxV
I have a problem with RebDB: how does db-select/group work?
Example:
>> db-select/where/group/count [ID title post date]  archive  [find post "t" ] [ID]
** User Error: Invalid number of group by columns
** Near: to error! :value
Endo
Don't you need to use aggregate functions when you use grouping?
Scot
I use the sql dialect like this:
sql [select count [ID title post date] from archive group by [ID title post] where [find post "t"]]
The trick with this particular query is that the "count" selector must have exactly one more column than the "group by" selector. The first three elements [ID title post] are used to sort the output and the last element [date] is counted.
output will be organized:
    ID  title   post    count
I would like to be able to include other columns in the output that are not part of the grouping or count, but I haven't figured out how to do this in RebDB.  I have used a parse grammar on the output to achieve the desired result.
I would also like to query the results of a query, which I haven't figured out how to do without creating and committing a new database. So I have used a parse grammar to merge two queries.
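To illustrate the rule described above, here is a minimal Python sketch of the grouping behaviour: the leading columns form the group key and the one extra column is counted. It does not use RebDB. The function name `group_count` and the sample rows are made up for illustration, the column-count check mirrors the "Invalid number of group by columns" error only by analogy, and counting rows per group (rather than distinct or non-empty values) is an assumption.

```python
from collections import Counter

# Hypothetical sample rows standing in for the "archive" table.
ROWS = [
    {"ID": 1, "title": "a", "post": "text", "date": "2013-01-01"},
    {"ID": 1, "title": "a", "post": "text", "date": "2013-01-02"},
    {"ID": 2, "title": "b", "post": "tip", "date": "2013-01-03"},
]


def group_count(rows, select_cols, group_cols):
    """Count rows per group key.

    select_cols must have exactly one more column than group_cols:
    the group columns plus the single column being counted.
    """
    if len(select_cols) != len(group_cols) + 1:
        raise ValueError("Invalid number of group by columns")
    counts = Counter(tuple(row[c] for c in group_cols) for row in rows)
    # Output is sorted by the group key; each row is key columns + count.
    return [list(key) + [n] for key, n in sorted(counts.items())]


if __name__ == "__main__":
    for line in group_count(ROWS, ["ID", "title", "post", "date"],
                            ["ID", "title", "post"]):
        print(line)
```

In this sketch, passing four select columns with only [ID] as the group raises the error, which matches the shape of the failing call earlier in the thread. Passing [ID title post] as the group gives one output row per distinct combination, with the count in the last position.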

Pavel
SQLite version 4 has been announced/proposed. The default built-in storage engine is a log-structured merge database instead of the B-tree used in SQLite3. As far as I understand the docs, this store could be used standalone or with the SQL frontend. Google for SQLite4.
Kaj
Cool

Endo
I cannot see any announcement on the sqlite.org web site. Is SQLite 3.7.17 the latest and recommended version?
Kaj
I saw code last year, but it's probably still in deep development

Pavel
Endo, as I wrote, google for SQLite4. The direct link is: http://sqlite.org/src4/doc/trunk/www/design.wiki. There is also a mirror of the sources at https://github.com/jarredholman/sqlite4.

Pekr
Has anyone tried to work with ODBC under R3? I somehow can't load the following ODBC driver DLL: https://github.com/gurzgri/r3-odbc
Or, put differently, has anyone worked with Excel files via ODBC, using either R2 or R3? I tried Graham's code, which works for .xls files, but not for .xlsx files. When I convert my file to .xls, R2 returns "not enough memory" :-(
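; Open an ODBC port with a DSN-less connection string for the Excel driver
p: open [
     scheme: 'ODBC
     target: "Driver={Microsoft Excel Driver (*.xls)};DriverId=790;Dbq=c:\path-to-file\file.xls"
]
conn: first p                          ; get a statement port from the connection
insert conn "select * from [Sheet1$]"  ; query the worksheet named Sheet1
result: copy conn                      ; fetch the result rows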
As for R3 - maybe there was also some other R3 ODBC extension, but I somehow can't find it...

Last message posted 341 weeks ago.