AltME: Announce

Messages

Maxim
The Stone DB is consuming a lot of my time, but it's moving forward pretty nicely... current single-thread (in-RAM) imports run at 10 million nodes per second using an average node payload of 40 bytes (which is longer than the average I'd typically use). The majority of the time is spent verifying internal dataset integrity and memory copying.
It takes 3 seconds to basically grab all available process RAM (2GB) and create 30 million data nodes. 1 million nodes take 50ms on average; I'm getting pretty flat scaling so far, which is a very good sign. Note that the data is completely memory-copied into the DB; I'm not pointing to the original import data.
None of these benchmarks even use a dedicated import function... this is like the worst-case scenario for import. It's a dumb FOR loop using a fully bounds-checking single insert-node() function... if I wrote an import loop that only does the bounds checking and keeps running counters, I could likely scale the import a lot.
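Roughly, that worst case looks like the sketch below (illustrative only; insert_node(), node_payload and the other names are hypothetical, not the actual StoneDB API):

#include <stdint.h>

typedef struct { uint8_t bytes[40]; } node_payload;        /* ~40-byte average payload */

extern int insert_node(void *db, const node_payload *p);   /* fully bounds-checked single insert */

int64_t naive_import(void *db, const node_payload *src, int64_t count)
{
    int64_t inserted = 0;
    for (int64_t i = 0; i < count; i++) {
        /* every node goes through the checked single-insert path; the
           payload is memory-copied into the DB, no pointer to the
           original import data is kept */
        if (insert_node(db, &src[i]) != 0) break;
        inserted++;
    }
    return inserted;
}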
I'm now starting work on the higher-level interfaces, basically creating database setups on the fly, and hopefully by Friday I should have the file I/O started.
Maybe next week I'll start to see how I can create a native Stone DB interface for R3.
TomBon
Nice tech you are doing there, Maxim. Count me in for some big data tests. I've never used graph DBs before but would like to give it a try.
For a non-scalable setup, I currently solve this suboptimally via simple key traversal stored in a NoSQL core.

Rebolek
I've put my old regex engine on GitHub at http://github.com/rebolek/r2e2 so anyone can improve on it.

Kaj
For the Red JSON converter, I implemented TO-HEX and LOAD-HEX in ANSI.red:
http://red.esperconsultancy.nl/Red-C-library/dir?ci=tip
TO-HEX is like REBOL's; it has a /size refinement to specify the number of hex digits:
red>> to-hex 1023
== "000003FF"
red>> to-hex/size 1023 4
== "03FF"
red>> load-hex "3ff"
== 1023

Ladislav
I merged 0.4.40 (64-bit R3 for Linux) to community. Merry Christmas.
Maxim
does this include any view code?

Kaj
I made many improvements to the TNetStrings and JSON converters for Red:
http://red.esperconsultancy.nl/Red-TNetStrings/dir?ci=tip
http://red.esperconsultancy.nl/Red-JSON/dir?ci=tip
- Floating-point numbers are now parsed and loaded as file! values, so external data containing floats can at least be loaded and the numbers detected, to be processed further by your own functions.
red>> load-JSON "6.28"
== %6.28
- The char! type is now more explicitly supported: single-character strings are loaded as char! values, which is more efficient.
red>> load-JSON {["a", 9, "bc", 42]}
== [#"a" 9 "bc" 42]
- The object! type is now supported, so it becomes easier to emit TNetStrings with nested dictionaries and JSON data with nested objects. The converters can still (and need to) be compiled: they use the interpreter only very sparingly for object support.
red>> load-JSON/objects {{"a": 9, "bc": 42}}
== make object! [
    a: 9
    bc: 42
]
red>> print to-JSON context [a: 9 b: 42]
{
    "a": 9,
    "b": 42
}
- The bitset! type is now supported in the emitter. Small bitsets whose bits fall within the byte range are emitted as character lists.
red>> print to-JSON/flat s: charset [#"0" - #"9"]
["0","1","2","3","4","5","6","7","8","9"]
Complemented bitsets are not explicitly supported because they would be too large.
red>> print to-JSON complement s
"make bitset! [not #{000000000000FFC0}]"
Larger bitsets are emitted as integer lists.
red>> print to-JSON charset [100 1000]
[
    100,
    1000
]
- All Red data types can now be emitted. Types that are not explicitly supported are FORMed.
- Several new refinement options, in particular for object support.
red>> load-JSON/values {["#issue", "%file", "{string}"]}
== [#issue %file "string"]
Loading JSON objects and TNetStrings dictionaries still defaults to generating Red block!s.
red>> load-JSON {{"a": 9, "bc": 42}}
== [#"a" 9 "bc" 42]
red>> load-JSON/keys {{"a": 9, "bc": 42}}
== [a 9 bc 42]
- More efficiency optimisations. The converters use a minimum of memory.
- Unicode escapes in JSON strings are now fully supported.
red>> load-JSON {"Escapes: \"\\\/\n\r\t\b\f\u0020\u0000"}
== {Escapes: "\/^/^M^-^H^L ^@}
red>> print to-JSON {Controls: "\/^/^M^-^H^L^@}
"Controls: \"\\/\n\r\t\b\f\u0000"
red>> print to-JSON make char! 1
"\u0001"
- The JSON converter now implements the full specification on json.org except escaped UTF-16 surrogate pairs. There is little reason for them to occur in JSON data.
Kaj
The JSON converter is still smaller than the official R2 implementation. It's now larger than the R3 implementation, but has more features. It's still an order of magnitude smaller than most JSON implementations in other languages.

Maxim
StoneDB is starting to take shape. I got the preliminary disk storage prototype finished today. I can't give hard speed benchmarks since for now I've had no time to do extensive testing... but it seems able to store at least 500,000 nodes a second (about 14 MB/s), which is pretty decent for a prototype using default C disk-writing functions and no regard at all for disk I/O profiling. This is even more acceptable considering it's running on a lame notebook disk. (I should have an SSD after the holidays, so I'll be able to compare :-)
With the current architecture I should be able to read any cell directly from disk, so the query set can be larger than physical RAM.
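To illustrate the idea only (the cell size and names below are hypothetical, not the actual on-disk format): with fixed-size cells, any cell can be fetched with a single seek to a computed offset, so the working set never has to fit in RAM.

#include <stdio.h>
#include <stdint.h>

#define CELL_SIZE 64                        /* hypothetical fixed on-disk cell size */

/* read one cell straight from the DB file into buf */
int read_cell(FILE *db, uint64_t index, void *buf)
{
    if (fseek(db, (long)(index * CELL_SIZE), SEEK_SET) != 0) return -1;
    return fread(buf, CELL_SIZE, 1, db) == 1 ? 0 : -1;
}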
If all goes well, I should have persistent read/write access to the DB's file data done by the time I go to bed tonight... yay!
After that... cell linking, which will require a different variable-length dataset driver. This new one will allow perpetual appending without any need to copy memory :-)
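The usual trick for that, sketched below with hypothetical names (not necessarily the actual driver): allocate fixed-size chunks and link them, so an append never relocates data that is already stored.

#include <stdlib.h>
#include <string.h>

#define CHUNK_BYTES (1 << 20)               /* 1 MB per chunk */

typedef struct chunk {
    struct chunk *next;
    size_t        used;
    unsigned char data[CHUNK_BYTES];
} chunk;

typedef struct { chunk *head, *tail; } append_store;

/* append a variable-length record; previously stored data is never moved */
void *store_append(append_store *s, const void *rec, size_t len)
{
    if (len > CHUNK_BYTES) return NULL;
    if (s->tail == NULL || s->tail->used + len > CHUNK_BYTES) {
        chunk *c = calloc(1, sizeof *c);    /* start a new chunk, linked at the end */
        if (c == NULL) return NULL;
        if (s->tail) s->tail->next = c; else s->head = c;
        s->tail = c;
    }
    void *dst = s->tail->data + s->tail->used;
    memcpy(dst, rec, len);
    s->tail->used += len;
    return dst;                             /* stable pointer: chunks never move */
}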

Kaj
I updated Red on Try REBOL with the latest Red fixes and the latest version of my JSON and TNetStrings converters:
http://tryrebol.esperconsultancy.nl
Here's a fun example to try:
print json: read "http://api.bitcoinaverage.com/ticker/global/USD"
print to-JSON probe load-JSON/objects json
{
  "ask": 637.43,
  "bid": 635.18,
  "last": 636.42,
  "timestamp": "Wed, 25 Dec 2013 03:06:15 -0000",
  "volume_btc": 32298.43,
  "volume_percent": 41.27
}
make object! [
    ask: %637.43
    bid: %635.18
    last: %636.42
    timestamp: "Wed, 25 Dec 2013 03:06:15 -0000"
    volume_btc: %32298.43
    volume_percent: %41.27
]
{
    "ask": "637.43",
    "bid": "635.18",
    "last": "636.42",
    "timestamp": "Wed, 25 Dec 2013 03:06:15 -0000",
    "volume_btc": "32298.43",
    "volume_percent": "41.27"
}

Rebolek
I've released a new version of my Redis protocol at https://github.com/rebolek/prot-redis . It's now an R3 module, adds some functions to make life easier (see the documentation), and a few small bugfixes have been made.

Bo
NickA has agreed to allow his excellent "Learn Rebol" book to be published in "installments" in ODROID Magazine. Expect the first article to appear in the February 2014 issue, which will be available at http://magazine.odroid.com .
This will help open Rebol up to a new community of enthusiasts.

Ashley
Not quite ready for an announce, but I've been working on a universal data-manipulator type thingy which I've tentatively named 'munge. The reason I created it is that I find myself using the following idiom quite often:
    CSV file -> REBOL (do [%csv-tools.r %sqlite.r]) -> SQLite -> REBOL -> CSV file
which (for files up to about 100,000 rows) I'd like to replace with:
    CSV file -> REBOL (do %munge.r) -> CSV file
Here's the help text:
USAGE:
    MUNGE data size /header /where condition /part columns /delete /update action /unique /list /only /save file delimiter
DESCRIPTION:
     Manipulate tabular values in blocks and delimited files.
     MUNGE is a function value.
ARGUMENTS:
     data -- (Type: block file url)
     size -- Size of each record (Type: integer)
REFINEMENTS:
     /header -- Ignore first row
     /where
         condition -- Expression(s) that can reference columns as A, B, etc (Type: block)
     /part
         columns -- Offset position(s) to retrieve (Type: integer block)
     /delete -- Delete matching rows (returns original block)
     /update
         action -- Update offset value pairs (returns original block) (Type: block)
     /unique -- Returns sorted unique records
     /list -- Return new-line records
     /only -- Return after matching a single row (ignored by /delete and /update)
     /save -- Write result to a delimited file
         file -- (Type: file)
         delimiter -- (Type: char)
and an example of some of the fun things you can already do with it:
test: copy ["Name" "Age" "Bob" 33 "Joe" 55 "Joe" 55]
munge/header/where/list test 2 [B = 55]
munge/header/unique/part test 2 1 ; this does not alter the block
munge/header/unique test 2 ; this alters the block
write/string %test.csv "Name,Age^/Bob,33^/Joe,55^/Joe,55"
munge/header/where/list %test.csv 2 [B = "55"] ; file values are all string
test: munge %test.csv 2
munge/header/update test 2 [2 [to integer! B]] ; update col 2 with val / block
munge/header/delete/where test 2 [B <> 55]
munge/part/save %test.csv 2 [1 2 2] %test2.csv #"," ; save with an additional column
Beta can be found here: https://dl.dropboxusercontent.com/u/8269768/munge.r
I'd appreciate feedback on the name, scope, and code.
Paul
Hi Ashley, that is kinda what I was doing with the original concept behind my first Tretbase db.  It was to allow someone to do that.  I think the scope is a great idea.  Keep up the great work!

Oldes
We are among the finalists of IGF 2014 with the new game we are working on (using REBOL as an important building tool): http://igf.com/2014/01/2014_independen.html

Ashley
munge.r updated. Now supports reading from and saving to Excel*.
    munge/save ["a" 1 "b" 2] 2 %test.xml
    munge %test.xlsx
* on Windows with R2 and Office installed
Split off block loading into a separate function for those requiring more options when loading from a file:
        write %test.csv "Name,Age^/Bob,33^/Joe,55^/Joe,55"
        load-block %test.csv
        load-block/skip %test.csv 1                         ; ignore header row
        load-block/skip/coerce %test.csv 1 [2 integer!]     ; ignore header and coerce 2nd column to integer
        write %test.csv "Name^-Age^/Bob^-33^/Joe^-55^/Joe^-55"
        load-block/delimit %test.csv #"^-"                  ; use tab as a delimiter
        load-block %test.xlsx                               ; read 1st worksheet of an Excel spreadsheet
        load-block/sheet %test.xlsx 2                       ; read 2nd worksheet
and same for Excel options when creating workbooks / sheets:
        open-workbook
        add-worksheet/sheet [String Number] ["Name" "Age" "Bob" 33 "Joe" 44] "My Sheet"
        add-worksheet/footer [String Number Number Number] [
            "Name" "A" "B" "Total"
            "Bob" 1 2 "=RC[-1]+RC[-2]"
            "Joe" 3 4 "=RC[-1]+RC[-2]"
        ] [none none "Grand Total" "=SUM(R[-?]C:R[-1]C)"]
        save-workbook %test.xml

Ashley
Munge released (and documented) at http://dobeash.com/munge.html
Notable additions include support for reading from and writing to SQL Server tables (without the need for any temporary files) and the inclusion of a flip refinement to swap columns and rows. Enjoy!
