Trading ORA-30926 for ORA-08006

This one bit me this morning when I violated golden rule #3 (‘Never change two not completely predictable things on a common target simultaneously, at least unless you are in a hazardous mood.’ An even more popular exemplification of that rule would be the slogan ‘Don’t drink & drive.’).

So I changed

  • the semantics of one source table of a merge statement (extract individual phrases out of a text field into multiple rows)
  • the partitioning of the target table (including enabling row movement)

The result of the former change alone should have produced an ORA-30926 (‘unable to get a stable set of rows’) when some … hmm, say humanoid at least … managed to restate the same phrase over and over again in the source system.

In conjunction with row movement, however, the merge statement instead raises an ORA-08006 (‘Someone deleted that row. It wasn’t me.’), which leaves one puzzled until one manages to find Todor Botev’s helpful investigation.
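
Since the root cause in both cases is a source that delivers the same merge key more than once, a cheap pre-check on the source rows can point at the culprit before the merge runs. The following is only a minimal sketch in Python: the row layout (document id, extracted phrase) and the helper name `duplicate_merge_keys` are assumptions for illustration, not part of the original statement.

```python
from collections import Counter

def duplicate_merge_keys(rows, key_of):
    """Return the merge keys that occur more than once in the source rows."""
    counts = Counter(key_of(row) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)

# Hypothetical source rows after splitting a text field into phrases:
# (document id, extracted phrase). The same phrase restated twice in one
# document yields two source rows for a single target row.
rows = [
    (1, "late delivery"),
    (1, "wrong colour"),
    (1, "late delivery"),
    (2, "late delivery"),
]

# Assumption: the merge matches on the full (id, phrase) pair.
print(duplicate_merge_keys(rows, key_of=lambda r: r))
```

Any key reported here would have to be deduplicated in the source query before the merge can see a stable set of rows.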


Sampling consistently out of (unordered, split, fragmented) detail data

One of the problems when sampling data out of raw files is that there may be consistency constraints on the picked lines.

For example, if you would like to extract representative sales slips out of TLOG texts, you want all the positions belonging to one sales slip to be sampled as a whole … and you want that property to hold consistently across all processes/machines onto which the texts/sampling have been distributed.

Using pseudo-random sequences seeded in the usual way will not work as expected here. You would first have to aggregate the sales slip headers, sample on the level of headers and then join the resulting sample with the original TLOG data again.

A nice idea that circumvents that necessity with only a minimal bit of overhead is inspired by Chuck Lam’s simple method of constructing Bloom filter hash functions. For each line, the pseudo-random generator is seeded with the hash code of a given key/object. Then, a fixed position of the random sequence (the new “seed”) is read as the observed random value.

For our TLOG case, all the lines carrying the identical sales slip key will get the same random boolean computed and will be filtered out or passed through all the same.
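
The idea can be sketched in a few lines of Python. This is an illustration under assumptions, not the original implementation: the names `keep_key` and `sample_lines`, the semicolon-separated line layout and the use of CRC32 as the hash are all made up for the example. A stable hash is used on purpose, because Python’s built-in `hash()` for strings may differ from process to process, which would break the consistency across machines.

```python
import random
import zlib

def keep_key(key, rate, position=0):
    """Decide whether everything carrying `key` belongs to the sample."""
    # CRC32 is stable across processes and machines (an assumption of this
    # sketch; any stable hash of the key would do).
    seed = zlib.crc32(key.encode("utf-8"))
    rng = random.Random(seed)
    # Advance to a fixed position of the sequence seeded by the key.
    for _ in range(position):
        rng.random()
    # The value at that position is the observed random value for the key.
    return rng.random() < rate

def sample_lines(lines, key_of, rate):
    """Keep a line if and only if its key is in the sample."""
    return [line for line in lines if keep_key(key_of(line), rate)]

# Hypothetical TLOG lines: sales slip key; position; article
tlog = [
    "slip-0001;1;milk",
    "slip-0002;1;bread",
    "slip-0001;2;butter",
    "slip-0003;1;apples",
    "slip-0002;2;jam",
]

def slip_key(line):
    return line.split(";")[0]

print(sample_lines(tlog, slip_key, rate=0.5))
```

Because the decision depends only on the key and the rate, every worker reaches the same verdict for a sales slip without any shared state, no matter how the lines have been split or ordered.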

Hadoop Quick (Default) Port Reference & Helping the datanode start up on NTFS/Windows 8

Hadoop Quick (Default) Port Reference

This comes in quite handy, e.g., when you need to set up firewall rules.

And here is a post related to the filesystem permission checks that the datanode service performs upon startup.

On my Windows 8 machine, I had to add the following lines to the hdfs-site.xml that comes with the Hortonworks HDP 1 distribution:

<property>
  <name>dfs.datanode.data.dir.perm</name>
  <value>700</value>
  <description>The permissions that should be there on dfs.data.dir
  directories. The datanode will not come up if the permissions are
  different on existing dfs.data.dir directories. If the directories
  don't exist, they will be created with this permission.
  </description>
</property>

Smarter than the average MSI (or: Running HDP-1.1.0-GA on Windows 2008 Server)

This is so great.

  1. Like to try out HDInsight Server Preview?
  2. Already got access to some “quite modern” Windows 2008 Server with most of the SDK-, .NET- and JDK-orgy set up?
  3. Unless … this wonderful piece of hardware just resides in the intranet and the web-installer is not an option.
  4. Ummpf. So, the M$-infrastructure partner also provides its own Hadoop Data Platform for Windows distribution.
  5. You also successfully managed to deal with the rest of the prerequisites (please don’t let the admins reveal that, especially the remote PowerShell part) and the configuration issues.
  6. And all this %§$”%- MSI responds is “Visual Studio C++ Redistributable (x64) Package not installed” while it is indeed.

This is soo great. See, you are not alone.

I simply can’t believe that a distribution of a managed code framework depends on a particular minor version of the underlying operating system.

I simply can’t believe that a dysfunctional MSI is the dead end of a once promising long road.

Rightly believed. Here is the fundamental trick for tweaking such a package using the Orca tool shipped with the Windows SDK.

Look into the LaunchCondition section and you will see the overly strict tests on the Visual C++ redistributable and the operating system version.

[Screenshot: msi_before_orca, the LaunchCondition entries before editing]

Delete the vc_redist dependency and adapt the operating system version accordingly.

[Screenshot: msi_after_orca, the LaunchCondition entries after editing]

And be sure that your JAVA_HOME (the environment path, not the shell variable, as the installer spawns a new environment during its run) contains neither spaces (like “Program Files”) nor double quotes. That is a lesson one learns when closely inspecting the resulting installer logs.
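
A quick sanity check on the path before launching the installer may save a round trip through the logs. The following Python sketch only tests the two conditions named above; the function name `java_home_problems` and the example paths are hypothetical, and how you read the system-level value (rather than that of your current shell) is left open here.

```python
def java_home_problems(path):
    """Return a list of reasons why `path` may trip up the installer."""
    if not path:
        return ["JAVA_HOME is empty or not set"]
    problems = []
    if " " in path:
        problems.append("contains a space")
    if '"' in path:
        problems.append("contains a double quote")
    return problems

# Hypothetical example values, not taken from a real machine.
for candidate in (r"C:\Program Files\Java\jdk", r'"C:\Java\jdk"', r"C:\Java\jdk"):
    print(candidate, "->", java_home_problems(candidate) or "looks fine")
```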

Installation completed successfully.