Newsfeeds
Ulei de tea tree
Planet MySQL - http://www.planetmysql.org/
-
Fun with Bugs #44 - Community Bugs Fixed in MySQL 5.7.14
MySQL 5.7.14 was officially released yesterday. So, it's time to check what bugs reported by MySQL Community in public were fixed in this release. Some of these bugs are presented below.As usual, let me start with InnoDB. The following bugs were fixed there:Bug #80296 - "FTS query exceeds result cache limit". It was reported (for 5.6, but I do not see new 5.6 release notes yet) by Monty Solomon and verified by Umesh.Bug #80304 - "generated columns don't work with foreign key actions". It was reported by Guilhem Bichot based on test case by Peter Gulutzan presented here.As most community bug reports during last 2-3 years, it was verified by Umesh.Bug #80298 - "Full-Text queries with additional secondary index gives NULL or Zero rows", was reported by Ray Lambe and verified by Umesh.Bug #76728 - "reduce lock_sys->mutex contention for transaction that only contains SELECT". This old bug report by Zhai Weixiang (who had provided a patch) was verified by Sinisa Milivojevic.Bug #80083 - "Setting innodb_monitor_enable to ALL does not enable all monitors". It was reported by Davi Arnaut and verified by Miguel Solorzano.Bug #79772 - "Foreign key not allowed when a virtual index exists". It was reported and verified by Jesper wisborg Krogh from Oracle.There are many more bugs fixed in InnoDB, but all of them were reported in internal Oracle's bugs database by Oracle employees. I do not like this trend.Now, let's check replication bugs that were fixed:Bug #79324 - "Slave is ~10x slower to execute set of statements compared to master RBR", was reported by Serge Grachov and verified by Umesh.Bug #62008 - "read-only option does not allow inserts/updates on temporary tables". This bug was reported long time ago by Ivan Stanojevic and verified by me when I worked in Oracle.It's really good to see it fixed now!Some bugs were fixed in Performance_Schema (who could imagine it has bugs...), but they were either reported internally or remain private, like Bug #81464. Just take into account that SELECT from some P_S tables could crash server before 5.7.14, based on release notes... This time I see several build-related bugs fixed, like these:Bug #81274 - "Add support for Solaris Studio 12.5 aka 5.14". It was reported (and probably fixed) by Tor Didriksen.Bug #81593 - "adapt to gcc 5.3 on solaris". It was also reported and fixed by Tor Didriksen. personally I am happy to see that Oracle still cares about Solaris and related software. Historical sentiments...Bug #80996 - "correct make_pair for c++11 (contribution)". This fix was contributed by Daniel Black. Bug #80371 - "MySQL fails to build with new default mode in GCC6". It was reported by Terje Røsten. The last but not the least, I also have to mention this bug in audit (and, thus, query rewrite) plugins, Bug #81298 - "query rewrite plugin suffers scalability issues". It was reported by Vadim Tkachenko and verified by Sinisa Milivojevic. This is a great improvement.To summarize, I see reasons to upgrade for those who rely a lot on FTS indexes in InnoDB, replication, audit plugins and Performance_schema. I had not even tried to build 5.7.14 from source yet, so I do not have any personal experience to share.
-
WarpSQL now has SQL shim plugins
I made some improvements to the 'proxy' inside of MySQL 5.7 that I've created for WarpSQL (Shard-Query 3). I've made the MySQL proxy/shim pluggable and I moved the SQLClient to sql/sql_client.cc. I've merged these changes into 'master' in my fork.Now you can create "SQL shim" plugins (SHOW PASSWORD is implemented in plugin/sql_shim) and install them in the server like regular plugins: -- command doesn't work mysql> show password; ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'password' at line 1 -- install the example sql_shim plugin: mysql> install plugin sql_shim soname 'sql_shim.so'; Query OK, 0 rows affected (0.00 sec) -- now the command works mysql> show password; +--+ | | +--+ | | +--+ 1 row in set (0.00 sec) There can be only one!There may only be ONE "SQL shim" plugin running in the server at one time. All "SQL shim" plugins must use the plugin name "sql_shim". This prevents more than one "SQL shim" plugin from being utilized at once. This is by design, because the MySQL plugin interface doesn't make any provisions for plugin execution order. If you install a SQL shim plugin by another name, it won't cause harm, but it will not be utilized.
-
MariaDB 10.2 CHECK and DEFAULT clauses
In this blog post, we’ll look at the MariaDB 10.2 CHECK and DEFAULT clauses. MariaDB 10.2 includes some long-awaited features. In this blog, we are going to discuss the improvements to some table definitions: the DEFAULT clause and the CHECK constraints. These clauses describe columns default values and rules for data validation. Note that MariaDB 10.2 is still in alpha stage. This article describes the current state of these features, which could change before MariaDB 10.2 becomes GA. The DEFAULT clause The DEFAULT clause has always been supported in MariaDB/MySQL, but traditionally it only accepted literal values (like “hello world” or “2”). MariaDB 10.2 removes this limitation, so DEFAULT can now accept most SQL expressions. For example: fiscal_year SMALLINT DEFAULT (YEAR(NOW())) valid_until DATE DEFAULT (NOW() + INTERVAL 1 YEAR) owner VARCHAR(100) DEFAULT (USER()) Additionally, MariaDB 10.2 allows you to set a DEFAULT value for the TEXT and BLOB columns. This was not possible in previous versions. While this might look like a small detail, it can be hard to add a column to an existing table that is used by production applications, if it cannot have a default value. The DEFAULT clause has some very reasonable limitations. For example, it cannot contain a subquery or a stored function. An apparently strange limitation is that we can mention another column in DEFAULT only if it comes first in the CREATE TABLE command. Note that DEFAULT can make use of non-deterministic functions even if the binary log uses the STATEMENT format. In this case, default non-deterministic values will be logged in the ROW format. CHECK constraints CHECK constraints are SQL expressions that are checked when a row is inserted or updated. If this expression result is false (0, empty string, empty date) or NULL, the statement will fail with an error. The error message states which CHECK failed in a way that is quite easy to parse:ERROR 4022 (23000): CONSTRAINT `consistent_dates` failed for `test`.`author`Some example of CHECK constraints: CONSTRAINT non_empty_name CHECK (CHAR_LENGTH(name) > 0) CONSTRAINT consistent_dates CHECK (birth_date IS NULL OR death_date IS NULL OR birth_date < death_date) CONSTRAINT past_date CHECK (birth_date < NOW()) A possible trick is checking that a column is different from its default value. This forces users to assign values explicitly. CHECK constraints cannot be added or altered. It is only possible to drop them. This is an important limitation for production servers. Another limitation is that CHECK metadata are not accessible via the INFORMATION_SCHEMA. The only way to find out if a table has CHECK clauses is parsing the output of SHOW CREATE TABLE. The exact behavior of CHECK constraints in a replication environment depends on the master binary log format. If it is STATEMENT, the slaves will apply CHECK constraints to events received from the master. If it is ROW, only the master will need to apply constraints, because failed statements will not be replicated. Thus, in all cases, we recommend having identical constraints on master and slaves, and only using deterministic constraints. Performance While I didn’t run a professional benchmark, I can say that both DEFAULT and CHECK clauses don’t have a noticeable impact on a simple test where we insert one million rows (on my local machine). However, these clauses evaluate an SQL expression each time a row is inserted or updated. The overhead is at least equal to the SQL expression performance. If high-performing writes are important, you will probably not want to use complex data validation. To check how fast an expression is, we can use the BENCHMARK() function:MariaDB [(none)]> SELECT BENCHMARK(10000000, (555 / 100 * 20)); +---------------------------------------+ | BENCHMARK(10000000, (555 / 100 * 20)) | +---------------------------------------+ | 0 | +---------------------------------------+ 1 row in set (1.36 sec) MariaDB [(none)]> SELECT BENCHMARK(100000000, MD5('hello world')); +------------------------------------------+ | BENCHMARK(100000000, MD5('hello world')) | +------------------------------------------+ | 0 | +------------------------------------------+ 1 row in set (14.84 sec)In this example, we executed the specified expressions ten million times. BENCHMARK() always returns 0, but what we want to check is the execution time. We can see for example that evaluating MD5(‘hello world’) takes less than 0.000002 seconds. In some cases, we may want to retry the same expressions with different parameters (longer strings, higher numbers, etc.) to check if the execution times varies. Unfortunately, we don’t have a status variable which tells us how many times MariaDB evaluated CHECK clauses. If our workload performs many writes, that variable could help us to find out if CHECK constraints are slowing down inserts. Maybe the MariaDB team can take this as a suggestion for the future.
-
Faceted search, why the DevAPI could matter one day
Faceted search or faceted navigation is a highly praised and widely use search pattern. And, it is a great reply to an off the records sales engineering question. MySQL finally has some document store features built-in. A bit of a yawn in 2016. There is a new X DevAPI available with some Connectors. A bit of a yawn technically. But it is a non-technical change of mind: developer centric counts! Sales, all, technical value could show at non-trivial developer tasks, like faceted search. Todays X DevAPI does not get you very far There are great stories to tell about the X DevAPI, see MySQL 5.7.12 – Part 3: More Than “Just” SQL?: Non-blocking API CRUD API for schemaless documents in collections and schemaful rows in tables SQL support Prepared for “we need new APIs” for distributed databases Most importantly the “look and feel” is similar to comparable offerings from NoSQL vendors. Competitive offerings have been described as easy, natural, developer friendly. We tried to follow these maximes. Albeit a leap step forward the feature set of the first version of the X DevAPI is limited. It works for basic apps like the demo app. But instead of adding a bulk of additional features we broaden our view to achieve the goal of improving developer ease of use on a whole: Out-of-the box experience X Shell – new commandline shell which supports scripting User guides, tutorials, demo app … Selling point readability The off the records sales engineers question was: why would one use the X DevAPI, given that it is only a dump SQL wrapper? Let a developer write some code to fetch all products from a collection that have a price higher than 1.2 Euros: products.find("price > 1.2").exeucte(); Upon execution, the Connectors (drivers) send the query to the X server plugin which translates it to some SQL and executes it. The SQL statement makes use of ANSI/ISO SQL standard JSON features which MySQL 5.7.12+ supports: SELECT product FROM products WHERE JSON_UNQUOTE(JSON_EXTRACT(product, "$.price")) > 1.2; There is no technical value in this. Schemaless? Works with SQL. JSON processing? Works with SQL, too. The X DevAPI selling point is readability. But none of the available features today and none of feature on the short term roadmap has the potential to add more value to the X DevAPI. Faceted search: let daddy sew a coat Faceted search supports exploring large amounts of data by displaying summaries about various partitions of the data and later allowing to narrow the navigation to a specific partition. Let there be a father that wants to sew a rain coat exactly as the one below for his one year old daughter. Daddy opens his web browser in the middle of the night after having completed all baby care and feeding duties and starts to search for fabric in an online retailer. But what to search for, how to quicky find the best fabric out of 100.000 offered by the online store? Many stores will display a search box and a faceted navigation at the left side. Possible facets (partitions, dimensions) are: Material Blends (16647) Cotton (16762) … Usage Baby (19913) Dress (20005) … Price Less than 10 Euros (13815) 10-20 Euros (16207) … A faceted document search using SQL Assume the products are stored in the database using (JSON) documents. All documents are in JSON column called “product” of a table called “products”. How to get the totals for all the facets using SQL? { "title": "Beach motive M182-16" "description": ... "weight": ... "material": "cotton" ... } The SQL for one facet is pretty much straight-forward: SELECT JSON_UNQUOTE(JSON_EXTRACT(product, "$.material")) AS facet_value, COUNT(*) AS facet_count FROM products WHERE JSON_EXTRACT(product, "$.material") IS NOT NULL GROUP BY JSON_EXTRACT(product, "$.material") To get the values for more than one facet using no more than one query call in your application, combine the results for each facet using UNION ALL. Add a column “facet” to mark the orgininating facet in the combined row result. SELECT "material" AS facet, JSON_UNQUOTE(JSON_EXTRACT(product, "$.material")) AS facet_value, COUNT(*) AS facet_count FROM products WHERE JSON_EXTRACT(product, "$.material") IS NOT NULL GROUP BY JSON_EXTRACT(product, "$.material") UNION ALL SELECT "usage" AS facet, JSON_UNQUOTE(JSON_EXTRACT(product, "$.usage")) AS facet_value, COUNT(*) AS facet_count FROM products WHERE JSON_EXTRACT(product, "$.usage") IS NOT NULL GROUP BY JSON_EXTRACT(product, "$.usage") Speeding it up and shortening the SQL There’s quite a bit to come and the length of the SQL statement will grow. Let’s shorten it a bit by adding generated columns (5.7.6) and speed things up using indicies. Long story short, here’s the CREATE TABLE statement I’m using for this blog post: CREATE TABLE `products` ( `product_id` int(11) NOT NULL AUTO_INCREMENT, `product` json DEFAULT NULL, `facet_size` varchar(255) GENERATED ALWAYS AS (json_unquote(json_extract(`product`,'$.size'))) VIRTUAL, `facet_material` varchar(255) GENERATED ALWAYS AS (json_unquote(json_extract(`product`,'$.material'))) VIRTUAL, `facet_usage` varchar(255) GENERATED ALWAYS AS (json_unquote(json_extract(`product`,'$.usage'))) VIRTUAL, `facet_gender` varchar(255) GENERATED ALWAYS AS (json_unquote(json_extract(`product`,'$.gender'))) VIRTUAL, `facet_price` double GENERATED ALWAYS AS (json_unquote(json_extract(`product`,'$.price'))) VIRTUAL, PRIMARY KEY (`product_id`), KEY `idx_facet_size` (`facet_size`), KEY `idx_facet_material` (`facet_material`), KEY `idx_facet_usage` (`facet_usage`), KEY `idx_facet_gender` (`facet_gender`), KEY `idx_face_price` (`facet_price`) ) Using the above the query to gather one facets totals is shortened to: SELECT facet_material AS facet_value, count(*) AS facet_count FROM products WHERE facet_material IS NOT NULL GROUP BY facet_material A range based face: price The price facet in the fabric online shop is range based. In Germany, the home of the example daddy, fabrics are sold per meter. The typical price of a fabric ranges from a few Euros up to 100 Euros with the majority topping around 40 Euros. The show owner wants to display totals for prices from 0..10, 10..20, 20..50, 50+ Euros. Note that the ranges are of different size. Of course, SQL can deliver! For example, use a subquery and CASE expressions. Stackoverflow will guide a developer within seconds. SELECT "price" AS facet, fpsub.facet_price_range AS facet_value, COUNT(*) AS facet_count FROM (SELECT (CASE WHEN facet_price BETWEEN 0 AND 10 THEN "0-10" WHEN facet_price BETWEEN 10.01 AND 20 THEN "10-20" WHEN facet_price BETWEEN 20.01 AND 50 THEN "20-50" ELSE "50+" END) AS facet_price_range FROM products WHERE facet_price IS NOT NULL) AS fpsub GROUP BY fpsub.facet_price_range Combine all the individual facet queries using UNION ALL and you have the basic search. Refinement: user selects a facet value Example Daddy made up his mind on the material for the inner jacket, selects “cotton” and enters “Beach” into the search box. Likely, the users expectation is that the selection will have no impact on the totals shown for any other material but cotton. In other words the material face query needs to be split in two: one to count the total for “cotton” and “Beach” and another one for the totals of all the other facet values. Note that I ignore the problem of full text search and use LIKE exactly how one should not use it. SELECT facet_material, count(*) FROM products WHERE facet_material = "cotton" AND JSON_UNQUOTE(JSON_EXTRACT(product, '$.title')) LIKE "%Beach%" GROUP BY facet_material UNION ALL SELECT facet_material, COUNT(*) FROM products WHERE facet_material != "cotton" AND facet_material IS NOT NULL GROUP BY facet_material The grand total Proof: you can use SQL for the job. SELECT "material" AS facet, facet_material AS facet_value, COUNT(*) AS facet_count FROM products WHERE facet_material = "cotton" AND JSON_UNQUOTE(JSON_EXTRACT(product, '$.title')) LIKE "%Beach%" GROUP BY facet_material UNION ALL SELECT "material" AS facet, facet_material AS facet_value, COUNT(*) AS facet_count FROM products WHERE facet_material IS NOT NULL AND facet_material != "cotton" GROUP BY facet_material UNION ALL SELECT "price" AS facet, fpsub.facet_price_range AS facet_value, COUNT(*) AS facet_count FROM (SELECT (CASE WHEN facet_price BETWEEN 0 AND 10 THEN "0-10" WHEN facet_price BETWEEN 10.01 AND 20 THEN "10-20" WHEN facet_price BETWEEN 20.01 AND 50 THEN "20-50" ELSE "50+" END) AS facet_price_range FROM products WHERE facet_price IS NOT NULL) AS fpsub GROUP BY fpsub.facet_price_range UNION ALL SELECT "size" AS facet, facet_size AS facet_value, COUNT(*) AS facet_count FROM products WHERE facet_size IS NOT NULL GROUP BY facet_size UNION ALL SELECT "usage" AS facet, facet_usage AS facet_value, COUNT(*) AS facet_count FROM products WHERE facet_usage IS NOT NULL GROUP BY facet_usage UNION ALL SELECT "gender" AS facet, facet_gender AS facet_value, COUNT(*) AS facet_count FROM products WHERE facet_gender IS NOT NULL GROUP BY facet_gender ORDER BY facet, facet_value +----------+------------------+-------------+ | facet | facet_value | facet_count | +----------+------------------+-------------+ | gender | female | 33387 | | gender | male | 33327 | | material | blends | 16647 | | material | designer fabrics | 16703 | | material | knits | 16739 | | material | silk | 16594 | | material | wool | 16555 | | price | 0-10 | 13815 | | price | 10-20 | 16207 | | price | 20-50 | 55668 | | price | 50+ | 14310 | | size | 114cm | 14464 | | size | 140cm | 14366 | | size | L | 14009 | | size | M | 14303 | | size | S | 14327 | | size | XL | 14211 | | size | XXL | 14320 | | usage | baby | 19913 | | usage | dress | 20005 | | usage | inside | 19929 | | usage | outdoor | 20166 | +----------+------------------+-------------+ 22 rows in set (0,34 sec) The execution time of 0.34s? Well, this is a source build of MySQL running in a VM on a notebook. No server parameters set. Inacceptable result for interactive search experience. How the DevAPI could make a difference A DevAPI counterpart of the above could read very different. products .find("title LIKE :search") .facets({ "usage", "material", "gender", "size", "price" { "0-10": {"min": 0, "max": 10"} ... } .bind("search", "%Beach%") As a developer, I could immediately grasp what the higher level DevAPI does. I would have no clue what the SQL is about if the column aliases would not hint me. Note also that the DevAPI and the X plugin are not limited to SQL. They could, if they want, use the lower level MySQL internal data access APIs. The internal APIs can be faster than SQL, way faster. SQL runs on top of them. Dear Sales, All, the DevAPI is only a tiny bit of the MySQL document store story. You may see it as a dump SQL wrapper of little value. Or, as a chance to offer an alternative query API that solves web developer problems much easier than raw SQL could do. Happy hacking! @Ulf_Wendel The post Faceted search, why the DevAPI could matter one day appeared first on Ulf Wendel.
-
MySQL Community Server 5.6
MySQL Community Server 5.6 (5.6.32 GA, published on Friday, 29 Jul 2016)