But isn't data mining more or less illogical? I'm sure you could find all sorts of odd things in any given datapool, but how do you establish that there's actually some sort of cause and effect? [snip]
Excellent thought process in those questions! As any woo will tell you, predicting the future can be so darn tough.

Cause and effect is indeed a problem. Data mining models must be trained and then tested against real-world data.
There are famous examples of spurious connections found. There is the (perhaps apocryphal) story of the best predictor of a famous US stock market average being the price of yak butter in Nepal. Correlation is not necessarily causation.
One of my favorite failures in data mining is a neural net algorithm meant to have a computer automatically distinguish friendly tanks from enemy tanks under battlefield conditions. They trained it by feeding it pictures of various tanks. When they tested it in the field, it was sure every tank was an enemy tank. After much research, it was found that the training pictures of friendly tanks were all nicely posed pictures in good lighting, not moving, etc. All the enemy pictures were of poorer lighting/quality. Oops. The net had learned to recognize picture quality, not tanks.
Also note that the choice of mining algorithm strongly impacts the types of questions that can be answered. Neural nets are great at adapting to changing data, but terrible at explaining "why."
The fact that someone bought an extended car warranty is a pretty darn good predictor they bought the car, but useless for business purposes. Going the other way, buying a car is not a useful predictor of whether or not the warranty will be purchased. There must be other factors at work.
Can one conclude that association analysis (people who bought X also bought Y) is useless? No. For many applications, correlation is often good enough. I don't care WHY people who buy A also buy B, but I can run a test of store layout and see if I increase the overall "attach rate" by moving A and B closer together.
And since warranties are so darn profitable, you can guarantee I am going to spend some time and money seeing if I can find a useful predictor that will increase the warranty purchase rate. Is it income level? Is it age? Is it education level? If I can identify something that gives me even 1% greater purchases of warranties, I will make the company millions of dollars. And I will get the hearty thanks of my company. Those thanks and $5 will get me a cup of coffee at Starbucks.
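For the curious, here is a rough sketch of how the attach rate and the car/warranty asymmetry might be computed. The function name `association_stats` and the five toy baskets are made up for illustration; real retail data would come from a transaction database and run to millions of rows.

```python
def association_stats(baskets, a, b):
    """Return (attach_rate, base_rate, lift) for the rule 'bought a -> bought b'.

    attach_rate: share of baskets containing a that also contain b
    base_rate:   share of all baskets that contain b
    lift:        attach_rate / base_rate (above 1 means a and b go together)
    """
    n_a = sum(1 for basket in baskets if a in basket)
    n_b = sum(1 for basket in baskets if b in basket)
    n_ab = sum(1 for basket in baskets if a in basket and b in basket)
    attach_rate = n_ab / n_a
    base_rate = n_b / len(baskets)
    return attach_rate, base_rate, attach_rate / base_rate


# Hypothetical toy data: each set is one customer's purchases.
example_baskets = [
    {"car", "warranty"},
    {"car"},
    {"car", "floor mats"},
    {"car"},
    {"floor mats"},
]

# Warranty buyers always bought the car (true but useless).
print(association_stats(example_baskets, "warranty", "car"))
# Car buyers only sometimes bought the warranty (the number I want to move).
print(association_stats(example_baskets, "car", "warranty"))
```

Note that lift comes out the same in both directions, while the attach rate does not. That is why I watch the attach rate in the direction I can actually influence.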
For a supermarket example, if I know people will usually buy A and B, perhaps I can move the products far away from one another to increase the amount of time in the store and the number of other products you have to see on the way. Now I am testing not the attach rate of A and B, but the overall profitability of the trip - a.k.a. "Basket Analysis." Another way to phrase this is "what is the overall impact on a shopping trip (total or percent profit) if someone buys X?" Both basket analysis and attach rate analysis are a big deal in the retail industry.
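A bare-bones version of that "what is the overall impact on a trip if someone buys X?" question might look like the sketch below. The trip records, the field names `items` and `profit`, and the function `basket_impact` are all invented for this example.

```python
from statistics import fmean


def basket_impact(trips, item):
    """Compare average trip profit with and without a given item.

    Returns (mean_with, mean_without, difference), or None when either
    group is empty and no comparison is possible.
    """
    with_item = [t["profit"] for t in trips if item in t["items"]]
    without_item = [t["profit"] for t in trips if item not in t["items"]]
    if not with_item or not without_item:
        return None
    mean_with = fmean(with_item)
    mean_without = fmean(without_item)
    return mean_with, mean_without, mean_with - mean_without


# Hypothetical shopping trips with total profit per trip.
example_trips = [
    {"items": {"milk", "bread"}, "profit": 4.0},
    {"items": {"milk"}, "profit": 2.0},
    {"items": {"bread", "chips"}, "profit": 3.0},
    {"items": {"chips"}, "profit": 1.0},
]

print(basket_impact(example_trips, "milk"))
```

This is still only a correlation. The difference tells me which layout change is worth testing, not that the test will pay off.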
Wouldn't that merely be a source of data that makes you want to start an actual study on it, rather than actually enough to reach a conclusion?
YES! You win the prize. Would you please join the management team of my company? You would be surprised how few people understand that.
And of course, to do proper testing, suddenly you are getting back down to the level of individual transactions studied in detail and in significant numbers in a controlled situation.
As the data mining analyst, I still don't care about the individual transaction. I test individual transactions against the model to see if statistically I get the expected percentage of correct answers.
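In practice that check can be as simple as the sketch below: score the model on held-out transactions and put a rough error bar on the hit rate. The function name `holdout_accuracy` and the 100 made-up predictions are assumptions for illustration, and the interval uses the plain normal approximation, which is only reasonable for decent sample sizes.

```python
import math


def holdout_accuracy(predictions, actuals, z=1.96):
    """Hit rate on held-out cases with an approximate 95% interval.

    Uses the normal approximation to the binomial, so treat the bounds
    as a rough guide rather than an exact answer.
    """
    n = len(actuals)
    hits = sum(p == a for p, a in zip(predictions, actuals))
    accuracy = hits / n
    margin = z * math.sqrt(accuracy * (1 - accuracy) / n)
    return accuracy, max(0.0, accuracy - margin), min(1.0, accuracy + margin)


# Hypothetical holdout set: the model is right on 80 of 100 transactions.
predicted = [1] * 80 + [0] * 20
actual = [1] * 100

accuracy, low, high = holdout_accuracy(predicted, actual)
print(f"accuracy {accuracy:.2f}, roughly {low:.2f} to {high:.2f}")
```

If the hit rate I expected from training falls well outside that range, the model learned something about the training data that does not hold in the real world (see: tanks).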
It seems to me this data mining thing is as ineffective in marketing as it is in any other scientific discipline.
It depends upon what you want to predict and how good your input sample is.
I must also take exception to the word "ineffective" when used with "any other scientific discipline," especially medicine. While I cannot guarantee that a particular case of disease X was cured by treatment Y, I have a pretty good chance of excluding other treatments as likely candidates by sampling the data. I don't care if a particular person had a miraculous spontaneous remission - if 80% of the patients treated with Y get better compared to 20% of untreated patients, western medicine is going to move to treatment Y as the standard of care until either a better predictor is found or some better treatment comes along. (Sounds like a job for a decision tree or clustering algorithm.)
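To put a number on why one spontaneous remission doesn't matter, here is a sketch of a standard two-proportion z-test on that 80% vs. 20% example. The group sizes of 100 patients each are my assumption (the percentages alone don't say how many patients there were), and `two_proportion_z_test` is just a name I picked.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    Returns (z, p_value) using the pooled standard error.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided tail probability of a standard normal beyond |z|.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Assumed sample sizes: 100 treated patients, 100 untreated.
z, p_value = two_proportion_z_test(80, 100, 20, 100)
print(f"z = {z:.2f}, p = {p_value:.1e}")
```

With groups that size, a gap that wide is far beyond anything chance would plausibly produce. With 5 patients per group it would be a much weaker story, which is the "how good is your input sample" point again.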
Further, if Amazon's suggestions are based on datamining, well that says it all doesn't it? I've almost never found myself interested in what other people also bought, as they tend to rarely be related to what I got. Further, Amazon has at a time suggested that one "Possible source of inspiration" for Mozart was the band Queen.
My (only half-joking) comment is that JREF is a pretty odd subspecies of consumer. Amazon does not care if YOU buy Queen and Mozart - they have found that enough people like both to be profitable showing this to anyone that buys one of them. And yes, if Amazon is positing that Mozart channelled a band from several hundred years in the future, then yes, they need to add a temporal check on their mining model rules.
The science of data mining is still fairly young. There are only now decent tools to make it a worthwhile area for mainstream businesses to consider. And it will ALWAYS take people who know how to set up the cases, pick a good algorithm, test the results with known data, and then DESIGN A REAL WORLD test with unknown data to make it worthwhile. With OLAP, it is often easy to tell a new customer that with some expert help, we can have key performance indicators (KPIs) of your business on the executive's dashboard in a matter of weeks. Return on investment (ROI) can often be measured fairly quickly and easily.
Data mining is a tougher nut to crack. You may spend a lot of time and confirm that the data you have does not predict anything. You can then go back and collect/purchase more data - customer surveys, demographics, competing market data, etc., and see if you can find better predictors. But management will not often say - "Thanks for spending 3 months and $1 million to come up with nothing. Here's more money." It takes management discipline to go after this stuff.
-----
I promise I really can make concise posts. You have just stumbled onto the only deep area of experience I have (that is useful for financial gain). As penance I will do 10 reply posts consisting of nothing but "Evidence?".
CriticalThanking